TABLE 15: 169 GENES WITH SEQUENCE INFORMATION DEPICTED IN TABLE 16

Table 15 depicts the UnigeneID, UnigeneTitle, Primekey, Predicted Cellular Localization, and
Exemplar Accession for all of the sequences in Table 16. The information in Table 15 is
linked by EosCode to Table 16.



Pkey: Unique Eos probeset identifier number 

ExAccn: Exemplar Accession number, Genbank accession number 

UnigeneID: Unigene number

UnigeneTitle: Unigene gene title

EosCode: Internal Eos name 

Localization: Predicted cellular localization of gene product 



Pkey ExAccn UnigeneID UnigeneTitle EosCode Localization



100394 
100452 
101249 
101485 
101514 
101851 
102398 
102522 
102669 
103119 
103709 
104080 
104144 
104691 
105370 
106149 
106579 
107102 
107217 
108153 
109014 
109112 



110151 
112971 
113021 
114908 
114965 
116393 
116416 
117698 
117984 
118985 
119018 
119126 
120992 
121710 
121913 
122041 
122593 
123209 
124526 
126399 
126645 
126966 
127537 
128790 
129109 
129184 
129389 



D84276 

D87742 

L33881 

M24736 

M28214 

M94250 

U42359 

U53347 

U71207 

X63629 

M037316 

AA402971 

AA447439 

M011176 

AA236476 

AA424881 

AA456135 

AA609723 

D51095 

AA054237 

AA156790

M1 69379 

H04649 

H18836 

T17185 

T23855 

AA236545 

AA250737 

AA599463 

AA609219 

N41002 

N51919 

N94303 

N95796 

R45175 

AA398246 

AA419011 

AA428062 

AA431407 

AA453310 

AA489711 

N62096 

AA128075

AI167942

R38438 

AA569531 

AA291725 

AA491295 

W26769 

AA621604 



Hs.66052 

Hs.241552 

Hs.1904 

Hs.123072 
Hs.82045 

Hs.183556 

Hs.29279 

Hs.2877 

Hs.13804 

Hs.57771 

Hs.183390 

Hs.37744 

Hs.22791 

Hs.256301 

Hs.23023 

Hs.30652 

Hs.40808 

Hs.262036 

Hs.257924

Hs.20843 

Hs.31608

Hs.83883 

Hs.129836 

Hs.54973 

Hs.72472 

Hs.39982 

Hs.45107 

Hs.106778

Hs.55028 

Hs.278695 

Hs.117183 

Hs.97594 



Hs.98732 
Hs.128749 
Hs.203270 
Hs.293185 

Hs.61635 
Hs.182575
Hs.162859
Hs.105700
Hs.108708
Hs.109201



CD38 antigen (p45) PBC1 
KIAA0268 protein PAB7 
protein kinase C, iota OAA1 
selectin E (endothelial adhesion molecul ACC5 
RAB3B, member RAS oncogene family PFJ2 
midkine (neurite growth-promoting factor LBH9 
gb:Human N33 protein form 1 (N33) gene, PDG3 
solute carrier family 1 (neutral amino a PFJ4 
eyes absent (Drosophila) homolog 2 LEM9
cadherin 3, type 1, P-cadherin (placenta LBG2
hypothetical protein dJ462023.2 PD06 
kallikreinU PBA6 
hypothetical protein FLJ13590 PDM3 
Homo sapiens beta-1 adrenergic receptor PAV1 
transmembrane protein with EGF-like and PDM9 
hypothetical protein MGC13170 PD08
ESTs PAA4
KIAA1344 protein PAA3
DKFZP586E1621 protein PDG8 
ESTs PBF1 
ESTs, Weakly similar to Z223.HUMAN ZINC 
hypothetical protein FLJ13782 BCU4 
Homo sapiens cDNA FLJ11245 fis, clone PL
hypothetical protein FLJ20041 PAV9
transmembrane, prostate androgen induced
KIAA1028 protein PD03
cadherin-like protein VR20 PFJ6 
ESTs BCY2 
hypothetical protein MGC2648 PDV3 
ESTs OAB6 
ESTs PDT9 
ATPase, Ca++ transporting, type 2C, memb 
ESTs, Weakly similar to 154374 gene NF2 PDM8 
Homo sapiens prostein mRNA, complete cds 
ESTs PBF8 
KIAA1210 protein PDG5 
prostate androgen-regulated transcript 1 PDV5 
ESTs; protease inhibitor 15 (PI15) BCU7 
Homo sapiens Chromosome 16 BAC clone CIT 
alpha-methylacyl-CoA racemase PD01 
ESTs, Weakly similar to ALU1.HUMAN ALU S 
ESTs, Weakly similar to JC7328 amino aci PAV4 
transmembrane, prostate androgen induced 
six transmembrane epithelial antigen of PAA5 
solute carrier family 15 (H+/peptide tra PD05
ESTs PAA6 
secreted frizzled-related protein 4 BCX2 
calcium/calmodulin-dependent protein kin PFJ7 
CGI-86 protein PAV6 
spondin 2, extracellular matrix protein CJA5 



plasma membrane 
not determined 
cytoplasmic 
plasma membrane 
cytoplasmic 
secreted 

plasma membrane 
cytoplasmic 
plasma membrane 

secreted 

plasma membrane 
plasma membrane 

plasma membrane 
not determined 

plasma membrane 
PDG7 

not determined 
PDG4 

plasma membrane 
CHA1 not determined 

plasma membrane 

mitochondrial 

secreted 

ER 

PAJ5 not determined 
PAB2 plasma membrane 



vesicular 

PAZ1 not determined 

PAA2 plasma membrane 
plasma membrane 
PDY4 

plasma membrane 
plasma membrane 
not determined 
secreted 

vesicular 
not determined 
129404


AA172056




129534 


R73640 


Hs.11260


130760 


M1 28997 


Hs.18953


131425 


AA219134 


Hs.26691 


132964 


AA031360




132967 


AA032221 


Hs.61635 


133179 


U81599 


Hs.66731 


133330 


U42360 


Hs.71119


133520 


X74331 


Hs.74519 


133724 


U07919 


Hs.75746 


133724 


U07919 


Hs.75746


133944 


AA045870 


Hs.7780 


134110 


U41060 


Hs.79136


301805 


AI800004 


Hs.142846


302005 


AI869666 


Hs.123119 


302881 


AA508353 


Hs.105314 


303506 


AA340605 


Hs.105887


303699 


D30891 


Hs.19525


303753 


AW503733 


Hs.9414 


308050 


AI460004 


Hs.31608


310382 


AI734009 


Hs.127699


310431 


AI420227


Hs.149358


310573 

W 1 WW / w 


AW292180 


Hs.156142


O 1 \JOtJO 


AloooniQ 

ASIOOOv IO 


no. I HUOHO 


O 1 UO 1 o 


Mlo/ouot 


no.tit'+iJDO 


^1 1 ^Qfi 


AiCQpnpo 

MIDOtUOO 


H«s 7Q°,7R 
no, / so / 0 


°,1°.fi7fi 

O lOO/O 


AAftfi1fiQ7 


no, 1 tUOa 1 


314121 


rsi r 0£ 1 \Aj 


no. 10/DI9 


314691 


AW207206 


Hs.136319


0 If/03 


AI538226 


Hs.32976 


314907 


AI672225 


Hs.222886 


O 1 DVD i 


AW292425 




°.1 sn^p 


AA876910 


Hs.134427


316442 


AA760894 


Hs.153023 


q 171:4a 

J 1 /340 


AI654187 


Hs.195704


qi7Qca 
O I / 003 


AW295184 


Hs.129142 


Q1QCO/1 


AW291511 


Hs.159066 


o i y i a i 


AF071538 




01Q7CO 
O I y / DO 


AA460775 


Hs.6295 


qonQOA 

OZXiOd.** 


AF071202 


Hs.139336


qonCRl 


NM_006953 Hs.159330


oofi7Qfi 


AF038966 


Hs.31218 


O.Ol>f/M 
0^144 I 


AW297633 Hs.118498




W07459 


Hs.157601 


QOO700 


AA056060 


Hs.202577 


oooqiq 


AW043782 Hs.293616 


qOOOOfi 
0£0££0 


AF055019 


Hs.21906 




AA639902 


Hs.104215 


0040QC 


AI146686 


Hs.143691 


Ot*t*fOU 


AA464018 


Hs.184598


qoAftAQ 
0&40UO 


AW016378 Hs.292934 


jtWI / 


AA508552 


Hs.195839


qOAROA 
0<i40£O 


AI685464 






AI694767 


Hs.129179 


004.71 ft 
ot<*/ lo 


AI557019 


Hs.116467


qqnoi 1 






qoncitc 
OOUD40 


U31382 


Hs.299867 


330762 


AA449677 


Hs.15251 


330790 


T48536 


Hs.122764 


330892 


AA149579 


Hs.91202 


331099 


R36671 


Hs.14846 


331490 


N32912 


Hs.291039 


331889 


AA431407 


Hs.98802 


332247 


N58172 




332396 


AA340504 




332697 


T94885 




332798 






334447 






338255 







ESTs PAB4 
hypothetical protein FLJ11264 PAJ3
phosphodiesterase 9A PEE6 
ESTs PBA7 
ESTs PAA7 
six transmembrane epithelial antigen of PM1 7 
homeo box B13 PFJ5 
Putative prostate cancer tumor suppresso PDM1 
primase, polypeptide 2A (58kD) PDM2 
aldehyde dehydrogenase 1 family, member 
aldehyde dehydrogenase 1 family, member 
Homo sapiens mRNA; cDNA DKFZp564A072 (fr
LIV-1 protein, estrogen regulated BCR4 
hypothetical protein PEU4 
MAD (mothers against decapentaplegic, Dr PBJ6
relaxin 1 (H1) PBH3 
ESTs, Weakly similar to Homolog of rat Z PEG4 
hypothetical protein FLJ22794 PBM4
KIAA1488 protein PBY3
hypothetical protein FLJ20041 PEU5
KIAA1603 protein PCQ8
ESTs, Weakly similar to A46010 X-linked PBH1 
ESTs PEN3 
ESTs PCW3 
ESTs PETS 
holocarboxylase synthetase (biotin-[prop PBH8 
ESTs PBY2 
ESTs PBY1 
ESTs BFF8 
guanine nucleotide binding protein 4 CB07 
ESTs, Weakly similar to TRHY.HUMAN TRICH 
ESTs PBM9 
ESTs PBJ7 
ESTs PBJ9 
ESTs PBQ6 
deoxyribonuclease II beta PBQ7 
hypothetical protein FLJ10188 PBJ1
prostate epithelium-specific Ets transcr PEN1
ESTs, Weakly similar to T17248 hypotheti PE07
ATP-binding cassette, sub-family C (CFTR PBH5 
uroplakin 3 PEL9 
secretory carrier membrane protein 1 PBY4 
Homo sapiens LUCA-15 protein mRNA, splic 
ESTs CBF9 
Homo sapiens cDNA FLJ12166 fis, clone MA 
ESTs PCQ7 
Homo sapiens clone 24670 mRNA sequence 
ESTs, Moderately similar to SPCN.HUMAN S 
ESTs PBQ9 
Homo sapiens cDNA: FLJ23241 fis, clone C 
ESTs PBM3 
ESTs, Weakly similar to I38022 hypotheti PBH4 
gb:tt88f04.x1 NCI_CGAP_Pr28 Homo sapiens 
Homo sapiens cDNA FLJ13581 fis, clone PL 
small nuclear protein PRAC CBK1 

PBJ2 

guanine nucleotide binding protein 4 PEW1 
hypothetical protein PBM1 
TMPRSS2, transmembrane protease, serine 
ESTs PBQ4 
Homo sapiens mRNA; cDNA DKFZp564D016 (fr
ESTs PCI4 
ESTs, Moderately similar to T14342 NSD1 PBH7 
gb:za21f09.s1 Soares fetal liver spleen PBQ5
gb:hw31a09.x1 NCI_CGAP_Kid11 Homo sapien
transgelin 2 PBQ8 

PBH2 
PBY9 
PBY7 



secreted 
nuclear 

plasma membrane 
plasma membrane 
nuclear 

plasma membrane 

PDT1 mitochondrial 
PDT1 mitochondrial 
PAB9 cytoplasmic 
plasma membrane 
nuclear 
cytoplasmic 
secreted 

not determined 
not determined 
plasma membrane 

plasma membrane 
plasma membrane 



not determined 
cytoplasmic 
PBM2 not determined

plasma membrane 



cytoplasmic 



plasma membrane 
plasma membrane 
not determined 
PBY8 not determined 
secreted 

PBQ1 not determined
plasma membrane 
PCI2 not determined 
PBJ5 

not determined 
PBY6 not determined 

cytoplasmic 
PCW6 

PBJ4 plasma membrane 
nuclear 

not determined 

cytoplasmic 

not determined 

PEL3 plasma membrane 

plasma membrane 

PCQ1 cytoplasmic 

nuclear 

not determined 
nuclear 

PBJ8 not determined 

secreted 

nuclear 

not determined 
not determined 
401424 PFG2
407122 H20276 Hs.31742 ESTs PEW7
408430 S79876 Hs.44926 dipeptidylpeptidase IV (CD26, adenosine PEZ3
408826 AF216077 Hs.48376 Homo sapiens clone HB-2 mRNA sequence
409262 AK000631 Hs.52256 hypothetical protein FLJ20624 PFG1
409361 NM_005982 Hs.54416 sine oculis homeobox (Drosophila) homolo PEW3
411096 U80034 Hs.68583 mitochondrial intermediate peptidase PEZ9
413125 BE244589 Hs.75207 glyoxalase I PFJ3
413623 AA825721 Hs.246973 ESTs OBH6
414422 AA147224 Hs.337232 Homeobox A13 PFC6
415263 AA948033 Hs.130853 ESTs PEZ5
417153 X57010 Hs.81343 "collagen, type II, alpha 1 (primary ost PFJ1
418601 AA279490 Hs.86368 calmegin PFA1
418848 AI820961 Hs.193465 ESTs PEY4
418882 NM_004996 Hs.89433 ATP-binding cassette, sub-family C (CFTR OBH2
419839 U24577 Hs.93304 "phospholipase A2, group VII (platelet-a PFH9
421887 AW161450 Hs.109201 CGI-86 protein PFH2
422083 NM_001141 Hs.111256 "arachidonate 15-lipoxygenase, second ty PFH5
424565 AW102723 Hs.75295 guanylate cyclase 1, soluble, alpha 3 PFA3
425071 NM_013989 Hs.154424 "deiodinase, iodothyronine, type II" PFH6
425710 AF030880 solute carrier family, member 4 PFD4
427958 AA418000 Hs.98280 potassium intermediate/small conductance PFH1
428819 AL135623 Hs.193914 KIAA0575 gene product PFD6
429900 AA460421 Hs.30875 ESTs PEZ7
429918 AW873986 Hs.119383 ESTs PEY5
430226 BE245562 Hs.2551 adrenergic, beta-2-, receptor, surface PEZ4
431217 NM_013427 Hs.250830 Rho GTPase activating protein 6 PFG6
431716 D89053 Hs.268012 fatty-acid-Coenzyme A ligase, long-chain PEZ1
431992 NM_002742 Hs.2891 protein kinase C, mu PFH4
432189 AA527941 gb:nh30c04.s1 NCI_CGAP_Pr3 Homo sapiens
432244 AI669973 Hs.200574 ESTs PEW8
432437 W07088 Hs.293685 ESTs PFG3
432966 AA650114 Hs.325198 ESTs PEY3
439176 AI446444 Hs.190394 ESTs, Weakly similar to B28096 line-1 pr PEW5
440260 AI972867 Hs.7130 copine IV PEW6
440901 AA909358 Hs.128612 ESTs PFC8
445424 AB028945 cortactin SH3 domain-binding protein PEZ6
446320 AF126245 Hs.14791 "acyl-Coenzyme A dehydrogenase family, m
447210 AF035269 phosphatidylserine-specific phospholipas PFH8
449156 AF103907 Hs.171353 prostate cancer antigen 3, non-coding DD PEZ8
449625 NM_014253 odz (odd Oz/ten-m, Drosophila) homolog 1 PEZ2
449650 AF055575 Hs.23838 calcium channel, voltage-dependent, L ty PFD2
451939 U80456 Hs.27311 single-minded (Drosophila) homolog 2 PFJ8
451982 F13036 Hs.27373 Homo sapiens mRNA; cDNA DKFZp56401763 (f
452039 AI922988 ESTs PFD8
452340 NM_002202 Hs.505 ISL1 transcription factor, LIM/homeodoma PFG4
452784 BE463857 Hs.151258 hypothetical protein FLJ21062 PFC5
452946 X95425 Hs.31092 EphA5 PFH3



mitochondrial 

plasma membrane 

PEY1 

nuclear 

nuclear 

mitochondrial 

cytoplasmic 



secreted 
ER 



secreted 

plasma membrane 
cytoplasmic 

secreted 

plasma membrane 
plasma membrane 
nuclear 



plasma membrane 
nuclear 

cytoplasmic 
PFA2 



PFH7 



plasma membrane 
plasma membrane 

PFG9 plasma membrane 

nuclear 
cytoplasmic 
plasma membrane 
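As an illustration of how the Table 15 columns described in the legend (Pkey, ExAccn, UnigeneID, UnigeneTitle, EosCode) could be separated programmatically, the following is a minimal sketch. The function name `parse_row` and both regular expressions are assumptions made for this example only; they are inferred from rows that are printed on a single line (for example the row for Pkey 407122) and are not part of the specification. Rows may lack a UnigeneID or an EosCode, so both are treated as optional.

```python
import re
from typing import Dict, Optional

# Assumed row shape: 6-digit Pkey, accession, optional "Hs.<n>", then title text.
ROW = re.compile(
    r"^(?P<pkey>\d{6})\s+"
    r"(?P<exaccn>[A-Z]{1,2}_?\d+)\s+"
    r"(?:(?P<unigene>Hs\.\d+)\s+)?"
    r"(?P<rest>.+)$"
)
# Assumed EosCode shape: three capital letters plus one digit at the end of the row.
EOS_CODE = re.compile(r"^(?P<title>.*\S)\s+(?P<eos>[A-Z]{3}\d)$")


def parse_row(line: str) -> Optional[Dict[str, Optional[str]]]:
    """Split one single-line Table 15 row into its fields, or return None."""
    match = ROW.match(line.strip())
    if match is None:
        return None
    rest = match.group("rest")
    eos_match = EOS_CODE.match(rest)
    return {
        "pkey": match.group("pkey"),
        "exaccn": match.group("exaccn"),
        "unigene": match.group("unigene"),  # None when the row has no Hs. number
        "title": eos_match.group("title") if eos_match else rest,
        "eos": eos_match.group("eos") if eos_match else None,
    }


if __name__ == "__main__":
    print(parse_row("407122 H20276 Hs.31742 ESTs PEW7"))
    print(parse_row("425710 AF030880 solute carrier family, member 4 PFD4"))
```

Keying the parsed rows by their `eos` value would then give the lookup used to link a Table 15 row to the corresponding Table 16 sequence entry, under the assumption that each EosCode is unique.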
TABLE 15A shows the accession numbers for those primekeys lacking a UnigeneID in Table
15. For each probeset we have listed the gene cluster number from which the
oligonucleotides were designed. Gene clusters were compiled using sequences derived from
Genbank ESTs and mRNAs. These sequences were clustered based on sequence similarity
using Clustering and Alignment Tools (DoubleTwist, Oakland California). The Genbank
accession numbers for sequences comprising each cluster are listed in the "Accession"



Pkey CAT number Accession 

116393 131543_1 AI972402 AI634409 AI523716 AI799749 W44518 AI424438 AI688513 AI971048 AI686324 AW013854 AA588483 AA528111 AI627428



AI582200 AI669296 AI826926 AI620526 AI669958 AI972458 AI924500 AA512903 W44517 AA335363 AW238997 BE300165
BE250665 AA284195 AA523420 W52834 AI471970 AI952824 AW003820 AW009463 M669796 AA114966 AI653342 AA115038
AI342150 AI092100 AI968211 W51994 AI804005 AI201420 AI123210 AI738405 AI674964 AI970341 AW027500 AI493316 AI333193
AI139353 AA599463 AI656163 AI804200 AI365321 AI990213 AI657011 M650025 AI968810 AI341978 AA599839 AW592602
AA644289 AI468578 AI565265 AI565228 BE221535 AW973052 



101485 18113_1 AA296520 AL021940 M30640 NM_000450 M24736 M61894 AL047443 H39560 AI694691 AA916787 AI214796 AA939085 AI150616

AA412553 AA412545 AI051015 T27654 AA694430
126399 17331_1 AA088767 AF224278 AA128075 AL035541 AA027926 AI761441 AI972096 AW071693 AI742327 AI377498 AI804815 AI640802



AI885001 AI921394 AA595115 N71820 AI921217 AW007283 AI467828 AI369306 AA917446 AI493698 M088701 M126899 AI936228
AW204238 AI039567 AI925027 BE138909 AW452945 AW135998 AA310984 AA027860 AW073519 AI537597 AA953976 AI521341
AW273569 AW050740 AA536113 AA559064 AI474392 AW135709 AA535181 AW572959 AA570597 AI905464 AI677810 AI587642
AW975102 M424310 AA482527 N64192 AA658276 AW889117 AA486591 AW889172 AI381990 AI381991 AI673419 AI990950
AA487031 AI272934 AI150565 AA229168 AW316722 AI142707 BE222396 AA614168 AA122026 AW338227 AA632457 AI968726 
AW369662 AA512956 AA541675 AA451748 AI250993 BE146418 AA122025 



132964 94346_1 AI362575 AI805082 AW263421 AI432462 AA135870 AA031360 AA031604 AA298475 AA298464

129389 21074_1 NM_012445 AB027466 BE407510 BE047605 AA047125 AW084003 AA149494 AA149490 AA292528 AA570505 AA526186 AW006250



AW007762 AI341557 AI799666 AI972710 AI377966 AI962810 AI084783 AI458032 AI190971 AW148913 AA372354 AW970032
AW007426 AA650188 AI123203 AI122890 AI280975 W73595 W73495 AI863238 AA374109 AA603986 AW149089 AW957523 
AI307748 AI921 067 AI336463 F24537 AI380460 AI367500 AM 89309 AI814701 AI766921 AW572106 AA037024 AW072576 AA578293 
AI288103 AA235464 AW450642 AA574230 AW294024 AI589229 AI580733 AW5 12227 AA877009 AI660255 AW1 88597 AA558228 
AI572782 AA658397 AI274628 AI866359 AA864573 AI264439 AA621604 AW515493 AW243333 Z39737 AI567038 AA573997 
AA573559 AW236431 AI652870 AI684973 AA034505 AA047126 



129404 156454.1 AI267700 AI720344 AA191424 AI023543 AI469633 AA172056 AW958465 AA172236 AW953397 AA355086

107217 9836_1 AL080235 AA031750 D81382 AI480231 AI095947 AI560953 BE010721 AI870290 AA374945 AA125792 D51527 D51556 AI685541



D51559 AW117286 AA195741 AI675138 AW593439 AI201885 T30590 AW952100 D51095 AA523864 W70043 AA987586 AI421515
AI205532 AA127069 AI337367 D51595 AI453785 AW075677 AW088359 C14287 C14284 



121710 19266_1 AF163474 NM_016590 AF163475 AI761105 AI770098 AA410580 AA411616 AI590343 AI739050 AL050198 AI862645 AA419104



AA513809 AA333032 AI816915 AW139625 AA640889 AI311391 AI627693 AW135514 AA419011 AI269149 AI245259 AI970008
AI970017 AW139445 AA569503 AI761072 AI766179 AI759995 AI300776 AI870129 AW150770 AA226501 AA226220 



121913 291015_1 AI249368 AI742316 AA428062 AA442089 AI864189 BE349478 AI803475 AI584049 BE552085 AI088609 AI264197 AI886144 AI129474

AI307145 BE181300 AW058403 AI696838 AW748598 AA442196 AI216428
102398 entrez_U42359 U42359

315051 347217.1 AW292425 BE467167 AI702953 BE550961 BE222309 AI299348 AI693336 AA541708
324626 336411_1 AI685464 AW971336 AA513587 AA525142

319191 16085_1 NM_012391 AF071538 AB031549 AI685592 AI745526 AA662204 AW130657 AA662164 AW971121 AI668916 AA513274 AI991223



AI979170 AW298436 AA639821 AI859010 AW513942 AI687669 AA662521 AA548598 AI345056 AI305374 BE043418 AI432856 
AI334840 AI379796 AI492693 AI307915 BE042082 AI307834 AI307858 AI309488 BE042210 AI435670 AI371605 AI862491 AI284563 
AI306872 AI255044 AI254601 A1251236 A1473073 AI473042 AI432760 AI435664 A1336826 AI289365 AI369096 AI862274 AI334871 
AI349863 AI250405 AI377617 AI309895 AI313017 AI862291 AI31 1936 AI378718 AI305722 AI306769 AI308888 AI334565 AI862296 
AI344230 AI435685 AI344087 AI378696 A131 1209 AI435775 AI31061 1 AI31 1 154 AI432289 AI431561 AI492681 AI432867 A1335288 
AI492796 AI432769 AI310299 AI432273 AI379820 AI275319 AI435753 AI609441 AI432767 AI369100 AI31 1420 AI349974 AI247157 
AI334677 AI270910 AI224320 AI305608 AI334489 AI377152 AI350012 AI370086 AI335053 AI306781 AI306750 AI334849 AI334874 
AI340380 AI307876 AI305974 AI305972 A131 1521 AI334872 AI862509 AI31 1498 AI335051 AI289684 AI310859 AI311862 AI862483 
AI492775 AI307906 AI492708 AI289693 AI340373 AI30791 0 AI31 1359 AI435653 AI334865 AI31 1492 AI492809 AI492690 AI431576 
AI862268 AI31 1879 AI308435 AI492792 AI862512 AI275321 AI431568 AI431564 AI307885 AI307926 AI435692 AI435778 AI310182 
AI308894 AI492707 AI492713 AI308560 AI307829 AI343234 AI580598 AW472796 A1340918 AI310243 AI309368 AJ307920 AI289665 



column. 



Pkey: Unique Eos probeset identifier number

CAT number: Gene cluster number

Accession: Genbank accession numbers
AI306777 AW086318 AW086292 AW086378 AI310027 AI275293 AI369082 AI340900 AI306749 AI371558 AW086287 BE043803
AI306793 AI306272 AI287948 AI270917 AI284816 AI336813 AI284546 AI308044 AI275290 AI270872 AI306795 AI289687 AI223570 
AI305303 AI289677 AI287742 A1275284 AI306812 AI336701 A1371554 AI378719 AI344988 AI223631 AI335141 AI343222 AI284568 
AI305357 AI275270 AI345932 AI436549 AI307925 AI311502 AI344238 AI343182 AI308508 AI305988 AI270790 AI379792 AI305647 
AI305410 AI432251 AI436517 AI343227 AI305534 AI340387 AI271043 AI305499 AI271046 AI305962 AI289465 AI305378 AI289725 
AI310848 A1305848 AI289362 AI252964 AI307049 AI310831 AI306993 A1306796 AI224659 AI305969 AI349855 AI306164 AI306948 
AI284676 AI309155 AI343202 AI432785 AI306815 AI369081 AI270885 A1289699 AI435704 AI309647 AI305716 AI31 1281 AI287927 
AI472995 A1340423 A1270958 AI307069 AI305364 A1270807 AI275306 AI31 1 890 AI275263 AI432750 AI289371 AI432861 AI2551 13 
AI305709 AI473008 AI311168 AI309711 AI377164 AI271201 AI289560 AI309710 AI306195 AI311201 AI287741 AI271066 AI432876 
AI275281 AI379795 AI472972 AI31 1 967 A1306826 AI305465 AI270792 AI47301 9 AI305340 AI270922 AI305995 AI305462 A1254144 
AI270969 AI473012 AI305390 A1275278 AI223644 AI289692 AI250318 AI305372 AI289691 AI250521 AI306283 AI306814 AI307933 
AI4731 60 AI432903 AI223720 AI254979 AI334862 AI306926 AI289541 AI432248 AI435722 AI435698 AI432859 AI310683 AI4731 75 
AI335144 AI289467 AI436489 AI306928 AI473033 AI305763 AI307868 AI307882 AI348959 AI435736 AI432857 AI432896 AI435735 
AI432283 AI473086 AI432863 AI473081 AI432825 AI307840 AI473164 AI432885 AI473166 AI472982 AI435734 AI473060 AI473171 
AI432279 AI432882 AI334670 A1436512 A1432827 AI432852 AI473051 AI473077 A1435697 AI271509 AI492781 AI472983 AI473018 
AI432897 AI473043 AI432871 AI436536 AI473157 AI34971 5 AI432777 AI473016 AI473158 AI340369 AI307941 AI432773 AI377146 
AI492791 AI270950 AI305342 AI284604 AI306269 AI28481 1 AI27081 1 AI289347 A1334869 AI334852 AI31 1759 AI250382 AI309520 
AI289550AI305721 AI340870 AI270901 AI308575 AI307904 AI340715 AI270941 AI309808 AI246867 AI473014 AI307039 AI289360 
AI473069 AI492786 AI34401 3 AI305876 AI436510 AI340742 AI473028 AI307891 BE041871 BE041268 BE042340 BE041946 
BE041783 AI306173 AI201948 A1926972 AI275769 

338255 CH22_6856FG__LINK_EM:AC00 

330211 c_5_p2 

332798 CH22_14FG_6_5_LINKC4G1.G 
334447 CH22_1746FG_387_7_LINK_EM

332247 372969.1 AA669097 AA513815 AA026798 AA676526 AA704429 AA704269 AW118292 AA579216 N58172 

332396 20265_1 AW579842 BE156562 BE156690 BE156489 BE081033 AK001559 BE149402 M85387 AW367811 AW367798 R17370 AI908947

AA382932 R58449 H18732 AA371231 AW962899 AA713530 AW892946 R53463 H11063 AW068542 Z40761 BE176212 BE176155
W23952 W92188 AW374883 AA303497 AW954769 AA036808 BE168063 AW382073 AW382085 AL041475 H80748 AI078161 
BE463983 AI805213 AI761264 W94885 N94502 AI623772 AI419532 AI810302 AI634190 AW002516 AW150777 AI352312 AI367474 
AW204807 AI675502 AI337026 AW134715 BE328451 AI123157 AI560020 A1300745 A1608631 AI248873 AA742484 AW051635 
H18646 AI245045 AA507111 AI640510 AI925594 AA115747 AA143035 AA151106

332697 13699_1 X51405 NM_001873 T11322 AL118886 BE328175 AW136009 BE467445 AW470313 AA774852 BE504139 AW501046 AA082792
AW389231 AA370044 R36841 AA371457 C04813 R25791 R25556 AW895854 AW90381 9 AW895671 AW895677 BE159723 
AW895664 AW895597 AW895595 AW895665 AW888518 AI903724 F06081 F08503 AL1 19462 AW895730 AW888516 R2651 1 
R26489 AA334126 M327626 N85713 AW895998 AA223622 F05468 AA370749 W05590 M78202 AA371073 AW498607 R15017 
T16991 AA001282 AA001 138 AA551566 AA330159 AI922855 AA383512 AA029603 D82246 D82171 T94933 H56545 AA348060 
AA176888 R96764 AW451817 AA385766 AA452618 AI690057 AA988822 BE549928 AA150901 W57992 AW899925 C05281 
AA932042 AA370980 AW962877 W04741 AA369982 AW385948 AA922466 N75882 AI422070 AI361256 AI680224 D57122 T94885 
R53266 R46713 T19071 AW796277 AA325333 F04719 F02334 AA358146 AA626597 AA358304 AW028099 AL1 19570 D57290 
D58273 D57796 N48555 AI361969 M329457 D57225 AW024046 AA992606 AW0221 18 AW021538 AA935845 H89870 H56546 
AW961219 AA453239 AW837541 N45521 BE21 8029 M31 8877 AA327740 AW961 809 T92 139 D532 16 D52365 D53363D53312 
D531 16 AI547267 AA679935 AW026552 AW026418 AW190507 AI927710 AW244108 D50948 AW054991 AW021063 AW022511 
AA493436 AI365636 BE464751 AW149384 AA102442 AW771368 A1818251 AI126368 D51049 AI421542 AI559467 AW079779 
AW021048 AW023969 AW044214 AI458264 AA027274 AI620254 AW028917 BE219511 AA326242 N67561 AI971273 AA878328 
D57131 AA770662 AI309299 AI796767 M613338 W58076 AI566287 AI445573 A1880260 AA001919 AW339259 A1492610 A1492611 
R97692 AI301425 AA722603 D58361 AI350323 AA973926 AI431263 AA516126 AA865467 AI925177 N39443 AA001943 AI299371 
AI082412 AA665090 AA583433 H89871 AA977231 AI362219 AI056096 AI270446 N67524 N22103 AW614224 AA744054 AW243622 
AI6131 88 AI929173 AI350243 AI362138 AA744004 AA1 76661 D56787 AI955625 AI393109 AI094769 AI479728 AI423107 AI955617 
AI034036 AI582196 AW264534 AI418961 AA570761 AI343538 AA650341 AA992503 AA770004 AL039666 AI862675 AW190335 
AA610274 AW418627 BE467472 D56786 T28749 AI217610 AI359556 T23523 AL040189 AA846222 M651636 D51280 AI888986 
AI521 167 AI340177 AW612815 AI625285 AA621607 AA177059 AA229768 AA829788 AI749682 AW190631 N75299 AA230089 
AI915632 BE069542 AA890020 AA528397 M995390 BE503860 AA570812 AW339396 AI197986 AI203725 AI282379 AA670375 
AA461513 F01728 AW243599 C00856 N75567 R95995 AA150932 R95961 AA648060 AA933800 AA927073 AA101126 AA864190 
T93566 BE1 67472 

425710 25529_1 AF030880 NM_000441 AC002467 AA385554 H23053 AW891838 AI139968 AA653057 AI695233
432189 342819_1 AA527941 AI810608 AI620190 AA635266

445424 6391_1 AB028945 T77648 F13328 AL157605 Z46212 AA304736 F11855 T66098 T30174 AW954164 AW176301 AW748243 AA456428
AI369958 AA938565 AW959613 Z42008 M994779 AI683909 F11019 F10926 AI769597 AI752550 T65015 AI884314 AA643954
Z41838 AW020147 AI038822 AW571822 AA299781 AA894928 AF131790 BE005411 AI902476 AW082695 AA464384 R42750
AW902301 AA464273 R05837 Z38294 H41098 AL134507 M86079 

447210 7119_1 AF035269 AF035268 NM_015900 T96213 U37591 AA156832 AA299371 AI084325 H95977 AI765967 BE221465 AA156726 AI969563
AW024539 AI436791 AI949451 AA843093 AI452756 AA824232 AI306667 T96131 AW207447 AW243556 AW957032 AI084332 
H95978 U30998 

449625 8113_1 NM_014253 AF100772 BE088769 AL022718 BE161779 AW863569 BE161640 AL039060 BE168542 AW296554 AA323193 AA235370
AW779760 N48674 AI375997 R45432 D59344 AI203107 F07491 R35360 R25094 AI913631 AI498402 T61382 AI016320 N45526 
T61415 AA331486 

452039 89513.1 AI922988 H05475 AA021608 AW169947 AA913750 Z41614 AW800012
TABLE 15B shows the genomic positioning for those primekeys lacking UnigeneIDs and
accession numbers in Table 15. For each predicted exon, we have listed the genomic
sequence source used for prediction. Nucleotide locations of each predicted exon are also
listed.

Pkey: Unique number corresponding to an Eos probeset

Ref: Sequence source. The 7 digit numbers in this column are Genbank identifier (GI) numbers. "Dunham I. et al." refers to the publication entitled "The DNA sequence of human chromosome 22." Dunham I. et al., Nature (1999) 402:489-495.

Strand: Indicates DNA strand from which exons were predicted.

Nt_position: Indicates nucleotide positions of predicted exons.



Pkey Ref Strand Nt_position

334447 Dunham, I. et al. Plus 14308764-14308824
332798 Dunham, I. et al. Minus 232147-231974
338255 Dunham, I. et al. Minus 15242294-15242231
330211 6013592 Plus 59158-59215
401424 8176894 Plus 24223-24428
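The Strand and Nt_position columns of Table 15B together determine the extent of each predicted exon. The sketch below shows one way to derive the exon length from an entry. It rests on two assumptions that the text does not state: that coordinates are 1-based and inclusive, and that Minus-strand entries are written from the higher to the lower coordinate (as the Minus-strand entries in the table appear to be). The function name `exon_span` is hypothetical.

```python
from typing import Tuple


def exon_span(nt_position: str, strand: str) -> Tuple[int, int, int]:
    """Return (start, end, length) for an Nt_position entry such as '232147-231974'."""
    first, second = (int(part) for part in nt_position.split("-"))
    # Assumption: Minus-strand entries run high-to-low, Plus-strand entries low-to-high.
    if strand == "Minus" and first < second:
        raise ValueError("Minus-strand entry expected in descending order")
    if strand == "Plus" and first > second:
        raise ValueError("Plus-strand entry expected in ascending order")
    start, end = min(first, second), max(first, second)
    # Assumption: coordinates are 1-based and inclusive, so length = end - start + 1.
    return start, end, end - start + 1


if __name__ == "__main__":
    rows = [
        ("334447", "Plus", "14308764-14308824"),
        ("332798", "Minus", "232147-231974"),
    ]
    for pkey, strand, position in rows:
        print(pkey, strand, exon_span(position, strand))
```

If the coordinates were instead half-open, each length would be one nucleotide shorter, so the inclusive convention should be confirmed against the source sequence before relying on these values.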
TABLE 11 AND SEQUENCE LISTING



SEQ ID NO:1 BCU4 DNA SEQUENCE

Nucleic Acid Accession #: NM_024915

Coding sequence: 13-1890 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51
| | | | | |

ATTGGATCAA ACATGTCACA AGAGTCGGAC AATAATAAAA GACTAGTGGC CTTAGTGCCC 60
ATGCCCAGTG ACCCTCCATT CAATACCCGA AGAGCCTACA CCAGTGAGGA TGAAGCCTGG 120
AAGTCATACT TGGAGAATCC CCTGACAGCA GCCACCAAGG CCATGATGAT CATTAATGGT 180
GATGAGGACA GTGCTGCTGC CCTCGGCCTG CTCTATGACT ACTACAAGGT TCCTCGAGAC 240 
AAGAGGCTGC TGTCTGTAAG CAAAGCAAGT GACAGCCAAG AAGACCAGGA GAAAAGAAAC 300 
TGCCTTGGCA CCAGTGAAGC CCAGAGTAAT TTGAGTGGAG GAGAAAACCG AGTGCAAGTC 360 
CTAAAGACTG TTCCAGTGAA CCTTTCCCTA AATCAAGATC ACCTGGAGAA TTCCAAGCGG 420 
GAACAGTACA GCATCAGCTT CCCCGAGAGC TCTGCCATCA TCCCGGTGTC GGGAATCACG 480 
GTGGTGAAAG CTGAAGATTT CACACCAGTT TTCATGGCCC CACCTGTGCA CTATCCCCGG 540 
GGAGATGGGG AAGAGCAACG AGTGGTTATC TTTGAACAGA CTCAGTATGA CGTGCCCTCG 600 
CTGGCCACCC ACAGCGCCTA TCTCAAAGAC GACCAGCGCA GCACTCCGGA CAGCACATAC 660 
AGCGAGAGCT TCAAGGACGC AGCCACAGAG AAATTTCGGA GTGCTTCAGT TGGGGCTGAG 720 
GAGTACATGT ATGATCAGAC ATCAAGTGGC ACATTTCAGT ACACCCTGGA AGCCACCAAA 780 
TCTCTCCGTC AGAAGCAGGG GGAGGGCCCC ATGACCTACC TCAACAAAGG ACAGTTCTAT 840 
GCCATAACAC TCAGCGAGAC CGGAGACAAC AAATGCTTCC GACACCCCAT CAGCAAAGTC 900 
AGGAGTGTGG TGATGGTGGT CTTCAGTGAA GACAAAAACA GAGATGAACA GCTCAAATAC 960 
TGGAAATACT GGCACTCTCG GCAGCATACG GCGAAGCAGA GGGTCCTTGA CATTGCCGAT 1020 
TACAAGGAGA GCTTTAATAC GATTGGAAAC ATTGAAGAGA TTGCATATAA TGCTGTTTCC 1080 
TTTACCTGGG ACGTGAATGA AGAGGCGAAG ATTTTCATCA CCGTGAATTG CTTGAGCACA 1140 
GATTTCTCCT CCCAAAAAGG GGTGAAAGGA CTTCCTTTGA TGATTCAGAT TGACACATAC 1200 
AGTTATAACA ATCGTAGCAA TAAACCCATT CATAGAGCTT ATTGCCAGAT CAAGGTCTTC 1260
TGTGACAAAG GAGCAGAAAG AAAAATCCGA GATGAAGAGC AGAAGCAGAA CAGGAAGAAC 1320 
GGGAAAGGCC AGGCCTCCCA AACTCAATGC AACAGCTCCT CTGATGGGAA GTTGGCTGCC 1380 
ATACCTTTAC AGAAGAAGAG TGACATCACC TACTTCAAAA CCATGCCTGA TCTCCACTCA 1440 
CAGCCAGTTC TCTTCATACC TGATGTTCAC TTTGCAAACC TGCAGAGGAC CGGACAGGTG 1500 
TATTACAACA CGGATGATGA ACGAGAAGGT GGCAGTGTCC TTGTTAAACG GATGTTCCGG 1560 
CCCATGGAAG AGGAGTTTGG TCCGGTGCCT TCAAAGCAGA TGAAAGAAGA AGGGACAAAG 1620 
CGAGTGCTCT TGTACGTGAG GAAGGAGACT GACGATGTGT TCGATGCATT GATGTTGAAG 1680
TCTCCCACAG TGATGGGCCT GATGGAAGCG ATATCTGAGA AATATGGGCT GCCCGTGGAG 1740 
AAGATAGCAA AGCTTTACAA GAAAAGCAAA AAAGGCATCT TGGTGAACAT GGATGACAAC 1800 
ATCATCGAGC ACTACTCGAA CGAGGACACC TTCATCCTCA ACATGGAGAG CATGGTGGAG 1860 
GGCTTCAAGG TCACGCTCAT GGAAATCTAG CCCTGGGTTT GGCATCCGCT TTGGCTGGAG 1920
CTCTCAGTGC GTTCCTCCCT GAGAGAGACA GAAGCCCCAG CCCCAGAACC TGGAGACCCA 1980 
TCTCCCCCAT CTCACAACTG CTGTTACAAG ACCGTGCTGG GGAGTGGGGC AAGGGACAGG 2040
CCCCACAGTC GGTGTGCTTG GCCCATCCAC TGGCACCTAC CACGGAGCCG AAGCCTGAGC 2100 
CCCTCAGGAA GGTGCCTTAG GCCTGTTGGA TTCCTATTTA TTGCCCACCT TTTCCTGGAG 2160 
CCCAGGTCCA GGCCCGCCAG GACTCTGCAG GTCACTGCTA GCTCCAGATG AGACCGTCCA 2220 
GCGTTCCCCC TTCAAGAGAA ACACTCATCC CGAACAGCCT AAAAAATTCC CATCCCTTCT 2280 
TTCTCACCCC TCCATATCTA TATCTCCCGA GTGGCTGGAC AAAATGAGCT ACGTCTGGGT 2340 
GCAGTAGTTA TAGGTGGGGC AAGAGGTGGA TGCCCACTTT CTGGTCAGAC ACCTTTAGGT 2400 
TGCTCTGGGG AAGGCTGTCT TGCTAAATAC CTCCAGGGTT CCCAGCAAGT GGCCACCAGG 2460 
CCTTGTACAG GAAGACATTC AGTCACCGTG TAATTAGTAA CACAGAAAGT CTGCCTGTCT 2520
GCATTGTACA TAGTGTTTAT AATATTGTAA TAATATATTT TACCTGTGGT ATGTGGGCAT 2580 
GTTTACTGCC ACTGGCCTAG AGGAGACACA GACCTGGAGA CCGTTTTAAT GGGGGTTTTT 2640
GCCTCTGTGC CTGTTCAAGA GACTTGCAGG GCTAGGTAGA GGGCCTTTGG GATGTTAAGG 2700 
TGACTGCAGC TGATGCCAAG ATGGACTCTG CAATGGGCAT ACCTGGGGGC TCGTTCCCTG 2760 
TCCCCAGAGG AAGCCCCCTC TCCTTCTCCA TGGGCATGAC TCTCCTTCGA GGCCACCACG 2820 
TTTATCTCAC AATGATGTGT TTTGCCTGAC TTTCCCTTTG CGCTGTCTCG TGGGAAAGGT 2880 
CATTCTGTCT GAGACCCCAG CTCCTTCTCC AGCTTTGGCT GCGGGCATGG CCTGAGCTTT 2940
CTGGAGAGCC TCTGCAGGGG GTTTGCCATC AGGGCCCTGT GGCTGGGTCT GCTGCAGAGC 3000 
TCCTTGGCTA TCAGGAGAAT CCTGGACACT GTACTGTGCC TCCCAGTTTA CAAACACGCC 3060 
CTTCATCTCA AGTGGCCCTT TAAAAGGCCT GCTGCCATGT GAGAGCTGTG AACAGCTCAG 3120 
CTCTGAGTCG GCAGACTGGG GCTTCCTCCT GGGCCACCAG ATGGAAAGGG GGTATTGTTT 3180 
GCCTCACTCC TGGATGCTGC GTTTTAAGGA AGTGAGTGAG AAAGAATGTG CCAAGATACC 3240 
TGGCTCCTGT GAAACCAGCC TCAGGAGGGA AACTGGGAGA GAGAAGCTGT GGTCTCCTGC 3300 
TACATGCCCT GGGAGCTGGA AGAGAAAAAC ACTCCCCTAA ACAATCGCAA AATGATGAAC 3360 
CATCATGGGC CACTGTTCTC TTTGAGGGGA CAGGTTTAGG GGTTTGCGTT CGCCCTTGTG 3420
GGCTGAAGCA CTAGCTTTTT GGTAGCTAGA CACATCCTGC ACCCAAAGGT TCTCTACAAA 3480
GGCCCAGATT TGTTTGTAAA GCACTTTGAC TCTTACCTGG AGGCCCGCTC TCTAAGGGCT 3540 
TCCTGCGCTC CCACCTCATC TGTCCCTGAG ATGCAGAGCA GGATGGAGGG TCTGCTTCTA 3600 
GCTCAGCTGT TTCTCCTTGA GGTTGCGGAG GAATTGAATT GAATGGGACA GAGGGCAGGT 3660
GCTGTGGCCA AGAAGATCTC CGAGCAGCAG TGACGGGGCA CCTTGCTGTG TGTCCTCTGG 3720 
GCATGTTAAC CCTTCTGTGG GGCCAAAGGT TTGCATCGTG GATCCAGCTG TGCTCCAGTC 3780 
TGTCCCCTCC TCCTCCACTC TGACTGCCAC GCCCCGGACC AGCAGCTTGG GGACCCTCCA 3840 
GGGTACTAAT GGGGCTCTGT TCTGAGATGG ACAAATTCAG TGTTGGAAAT ACATGTTGTA 3900 
CTATGCACTT CCCATGCTCC TAGGGTTAGG AATAGTTTCA AACATGATTG GCAGACATAA 3960 
CAACGGCAAA TACTCGGACT GGGGCATAGG ACTCCAGAGT AGGAAAAAGA CAAAAGATTT 4020
GGCAGCCTGA CACAGGCAAC CTACCCCTCT CTCTCCAGCC TCTTTATGAA ACTGTTTGTT 4080
TGCCAGTCCT GCCCTAAGGC AGAAGATGAA TTGAAGATGC TGTGCATGTT TCCTAAGTCC 4140 
TTGAGCAATC ATGGTGGTGA CAATTGCCAC AAGGGATATG AGGCCAGTGC CACCAGAGGG 4200






TGGTGCCAAG TGCCACATCC CTTCCGATCC ATTCCCCTCT GTATCCTCGG AGCACCCCAG 4260 
TTTGCCTTTG ATGTGTCCGC TGTGTATGTT AGCTGAACTT TGATGAGCAA AATTTCCTGA 4320 
GCGAAACACT CCAAAGAGAT AGGAAAACTT GCCGCCTCTT CTTTTTTGTC CCTTAATCAA 4380
ACTCAAATAA GCTTAAAAAA AATCCATGGA AGATCATGGA CATGTGAAAT GAGCATTTTT 4440
TTCTTTTCTT TTTTTTTTTT TTTTTTTAAC AAAGTCTGAA CTGAACAGAA CAAGACTTTT 4500 
TCCTCATACA TCTCCAAATT GTTTAAACTT ACTTTATGAG TGTTTGTTTA GAAGTTCGGA 4560 
CCAACAGAAA AATGCAGTCA GATGTCATCT TGGAATTGGT TTCTAAAAGA GTAAGGCATG 4620
TCCCTGCCCA GAAACTTAGG AAGCATGAAA TAAATCAAAT GTTTATTTTC CTTCTTATTT 4680 
AAAATCATGC TAATGCAACA GAAATAGAGG GTTTGTGCCA AATGCTATGA ACGGCCCTTT 4740 
CTTAAAGACA AGCAAGGGAG ATTGATATAT GTACAATTTG CTCTCATGTT TTT 



SEQ ID NO:2 BCU4 Protein sequence: 
Protein Accession #: NP_07919t1 

1 11 21 31 41 51 
| | | | | | 

MSQESDNNKR LVALVPMPSD PPFNTRRAYT SEDEAWKSYL ENPLTAATKA MMIINGDEDS 60 
AAALGLLYDY YKVPRDKRLL SVSKASDSQE DQEKRNCLGT SEAQSNLSGG ENRVQVLKTV 120 
PVNLSLNQDH LENSKREQYS ISFPESSAI1 PVSGITVVKA EDFTPVFMAP PVHYPRGDGE 180 
EQRVVIFEQT QYDVPSLATH SAYLKDDQRS TPDSTYSESF KDAATEKFRS ASVGAEEYMY 240 
DQTSSGTFQY TLEATKSLRQ KQGEGPMTYL NKGQFYAITL SETGDNKCFR HPISKVRSVV 300 
MVVFSEDKNR DEQLKYWKYW HSRQHTAKQR VLDIADYKES FNTIGNIEEI AYNAVSFTWD 360 
VNEEAKIFIT VNCLSTDFSS QKGVKGLPLM IQIDTYSYNN RSNKPIHRAY CQDCVFCDKG 420 
AERKIRDEEQ KQNRKNGKGQ ASQTQCNSSS DGKLAAIPLQ KKSDITYFKT MPDLHSQPVL 480 
FIPDVHFANL QRTGQVYYNT DDEREGGSVL VKRMFRPMEE EFGPVPSKQM KEEGTKRVLL 540 
YVRKETDDVF DALMLKSPTV MGLMEAISEK YGLPVEKIAK LYKKSKKG1L VNMDDNIIEH 600 
YSNEDTFILN MESMVEGFKV TLMEI 

SEQ ID NO:3 BCU7 DNA SEQUENCE VARIANT 1: 

Nucleic Acid Accession #: AA428062 

Coding sequence: 1-777 (entire sequence represents open reading frame) 

1 11 21 31 41 51 

| | | | | | 

ATGATAGCAA TCTCTGCCGT CAGCAGTGCA CTCCTGTTCT CCCTTCTCTG TGAAGCAAGT 60 

ACCGTCGTCC TACTCAATTC CACTGACTCA TCCCCGCCAA CCAATAATTT CACTGATATT 120 

GAAGCAGCTC TGAAAGCACA ATTAGATTCA GCGGATATCC CCAAAGCCAG GCGGAAGCGC 180 

TACATTTCGC AGAATGACAT GATCGCCATT CTTGATTATC ATAATCAAGT TCGGGGCAAA 240 

GTGTTCCCAC CGGCAGCAAA TATGGAATAT ATGGTTTGGG ATGAAAATCT TGCAAAATCG 300 

GCAGAGGCTT GGGCGGCTAC TTGCATTTGG GACCATGGAC CTTCTTACTT ACTGAGATTT 360 

TTGGGCCAAA ATCTATCTGT ACGCACTGGA AGATATCGCT CTATTCTCCA GTTGGTCAAG 420 

CCATGGTATG ATGAAGTGAA AGATTATGCT TTTCCATATC CCCAGGATTG CAACCCCAGA 480 

TGTCCTATGA GATGTTTTGG TCCCATGTGC ACACATTATA CGCAGATGGT TTGGGCCACT 540 

TCCAATCGGA TAGGATGCGC AATTCATGCT TGCCAAAACA TGAATGTTTG GGGATCTGTG 600 

TGGCGACGTG CAGTTTACTT GGTATGCAAC TATGCCCCAA AGGGCAATTG GATTGGAGAA 660 

GCACCATATA AAGTAGGGGT ACCATGTTCA TCTTGTCCTC CAAGTTATGG GGGATCTTGT 720 
ACTGACAATC TGTGTTTTCC AGGAGTTACG TCAAACTACC TGTACTGGTT TAAATAA 

SEQ ID NO:4 BCU7 DNA SEQUENCE VARIANT 2: 

Nucleic Acid Accession #: AA428062 

Coding sequence: 1-777 (entire sequence represents open reading frame) 

1 11 21 31 41 51 

| | | | | | 

ATGATAGCAA TCTCTGCCGT CAGCAGTGCA CTCCTGTTCT CCCTTCTCTG TGAAGCAAGT 60 

ACCGTCGTCC TACTCAATTC CACTGACTCA TCCCCGCCAA CCAATAATTT CACTGATATT 120 

GAAGCAGCTC TGAAAGCACA ATTAGATTCA GCGGATATCC CCAAAGCCAG GCGGAAGCGC 180 

TACATTTCGC AGAATGACAT GATCGCCATT CTTGATTATC ATAATCAAGT TCGGGGCAAA 240 

GTGTTCCCAC CGGCAGCAAA TATGGAATAT ATGGTTTGGG ATGAAAATCT TGCAAAATCG 300 

GCAGAGGCTT GGGCGGCTAC TTGCATTTGG GACCATGGAC CTTCTTACTT ACTGAGATTT 360 

TTGGGCCAAA ATCTATCTGT ACGCACTGGA AGATATCGCT CTATTCTCCA GTTGGTCAAG 420 

CCATGGTATG ATGAAGTGAA AGATTATGCT TTTCCATATC CCCAGGATTG CAACCCCAGA 480 

TGTCCTATGA GATGTTTTGG TCCCATGTGC ACACATTATA CGCAGATGGT TTGGGCCACT 540 

TCCAATCGGA TAGGATGCGC AATTCATACT TGCCAAAACA TGAATGTTTG GGGATCTGTG 600 

TGGCGACGTG CAGTTTACTT GGTATGCAAC TATGCCCCAA AGGGCAATTG GATTGGAGAA 660 

GCACCATATA AAGTAGGGGT ACCATGTTCA TCTTGTCCTC CAAGTTATGG GGGATCTTGT 720 
ACTGACAATC TGTGTTTTCC AGGAGTTACG TCAAACTACC TGTACTGGTT TAAATAA 

SEQ ID NO:5 BCU7 Protein sequence Variant 1: 
Protein Accession #: none 

1 11 21 31 41 51 

| | | | | | 

MIAISAVSSA LLFSLLCEAS TVVLLNSTDS SPPTNNFTDI EAALKAQLDS ADIPKARRKR 60 
YISQNDMIAI LDYHNQVRGK VFPPAANMEY MVWDENLAKS AEAWAATCIW DHGPSYLLRF 120 

LGQNLSVRTG RYRSILQLVK PWYDEVKDYA FPYPQDCNPR CPMRCFGPMC THYTQMVWAT 180 

SNRIGCAIHA CQNMNVWGSV WRRAVYLVCN YAPKGNWIGE APYKVGVPCS SCPPSYGGSC 240 
TDNLCFPGVT SNYLYWFK 

SEQ ID NO:6 BCU7 Protein sequence Variant 2: 
Protein Accession #: none 

1 11 21 31 41 51 

| | | | | | 

MIAISAVSSA LLFSLLCEAS TVVLLNSTDS SPPTNNFTDI EAALKAQLDS ADIPKARRKR 60 
YISQNDMIAI LDYHNQVRGK VFPPAANMEY MVWDENLAKS AEAWAATCIW DHGPSYLLRF 120 
LGQNLSVRTG RYRSILQLVK PWYDEVKDYA FPYPQDCNPR CPMRCFGPMC THYTQMVWAT 180 
SNRIGCAIHT CQNMNVWGSV WRRAVYLVCN YAPKGNWIGE APYKVGVPCS SCPPSYGGSC 240 
TDNLCFPGVT SNYLYWFK 

SEQ ID NO:7 BCX2 DNA SEQUENCE 

Nucleic Acid Accession #: NM_003014 

Coding sequence: 238-1276 (underlined sequences correspond to start and stop codons) 

1 11 21 31 41 51 

| | | | | | 

GGCGGGTTCG CGCCCCGAAG GCTGAGAGCT GGCGCTGCTC GTGCCCTGTG TGCCAGACGG 60 
CGGAGCTCCG CGGCCGGACC CCGCGGCCCC GCTTTGCTGC CGACTGGAGT TTGGGGGAAG 120 
AAACTCTCCT GCGCCCCAGA AGATTTCTTC CTCGGCGAAG GGACAGCGAA AGATGAGGGT 180 
GGCAGGAAGA GAAGGCGCTT TCTGTCTGCC GGGGTCGCAG CGCGAGAGGG CAGTGCCATG 240 
TTCCTCTCCA TCCTAGTGGC GCTGTGCCTG TGGCTGCACC TGGCGCTGGG CGTGCGCGGC 300 
GCGCCCTGCG AGGCGGTGCG CATCCCTATG TGCCGGCACA TGCCCTGGAA CATCACGCGG 360 
ATGCCCAACC ACCTGCACCA CAGCACGCAG GAGAACGCCA TCCTGGCCAT CGAGCAGTAC 420 
GAGGAGCTGG TGGACGTGAA CTGCAGCGCC GTGCTGCGCT TCTTCTTCTG TGCCATGTAC 480 
GCGCCCATTT GCACCCTGGA GTTCCTGCAC GACCCTATCA AGCCGTGCAA GTCGGTGTGC 540 
CAACGCGCGC GCGACGACTG CGAGCCCCTC ATGAAGATGT ACAACCACAG CTGGCCCGAA 600 
AGCCTGGCCT GCGACGAGCT GCCTGTCTAT GACCGTGGCG TGTGCATTTC GCCTGAAGCC 660 
ATCGTCACGG ACCTCCCGGA GGATGTTAAG TGGATAGACA TCACACCAGA CATGATGGTA 720 
CAGGAAAGGC CTCTTGATGT TGACTGTAAA CGCCTAAGCC CCGATCGGTG CAAGTGTAAA 780 
AAGGTGAAGC CAACTTTGGC AACGTATCTC AGCAAAAACT ACAGCTATGT TATTCATGCC 840 
AAAATAAAAG CTGTGCAGAG GAGTGGCTGC AATGAGGTCA CAACGGTGGT GGATGTAAAA 900 
GAGATCTTCA AGTCCTCATC ACCCATCCCT CGAACTCAAG TCCCGCTCAT TACAAATTCT 960 
TCTTGCCAGT GTCCACACAT CCTGCCCCAT CAAGATGTTC TCATCATGTG TTACGAGTGG 1020 
CGTTCAAGGA TGATGCTTCT TGAAAATTGC TTAGTTGAAA AATGGAGAGA TCAGCTTAGT 1080 
AAAAGATCCA TACAGTGGGA AGAGAGGCTG CAGGAACAGC GGAGAACAGT TCAGGACAAG 1140 
AAGAAAACAG CCGGGCGCAC CAGTCGTAGT AATCCCCCCA AACCAAAGGG AAAGCCTCCT 1200 
GCTCCCAAAC CAGCCAGTCC CAAGAAGAAC ATTAAAACTA GGAGTGCCCA GAAGAGAACA 1260 
AACCCGAAAA GAGTGTGAGC TAACTAGTTT CCAAAGCGGA GACTTCCGAC TTCCTTACAG 1320 
GATGAGGCTG GGCATTGCCT GGGACAGCCT ATGTAAGGCC ATGTGCCCCT TGCCCTAACA 1380 
ACTCACTGCA GTGCTCTTCA TAGACACATC TTGCAGCATT TTTCTTAAGG CTATGCTTCA 1440 
GTTTTTCTTT GTAAGCCATC ACAAGCCATA GTGGTAGGTT TGCCCTTTGG TACAGAAGGT 1500 
GAGTTAAAGC TGGTGGAAAA GGCTTATTGC ATTGCATTCA GAGTAACCTG TGTGCATACT 1560 
CTAGAAGAGT AGGGAAAATA ATGCTTGTTA CAATTCGACC TAATATGTGC ATTGTAAAAT 1620 
AAATGCCATA TTTCAAACAA AACACGTAAT TTTTTTACAG TATGTTTTAT TACCTTTTGA 1680 
TATCTGTTGT TGCAATGTTA GTGATGTTTT AAAATGTGAT GAAAATATAA TGTTTTTAAG 1740 
AAGGAACAGT AGTGGAATGA ATGTTAAAAG ATCTTTATGT GTTTATGGTC TGCAGAAGGA 1800 
TTTTTGTGAT GAAAGGGGAT TTTTTGAAAA ATTAGAGAAG TAGCATATGG AAAATTATAA 1860 
TGTGTTTTTT TACCAATGAC TTCAGTTTCT GTTTTTAGCT AGAAACTTAA AAACAAAAAT 1920 
AATAATAAAG AAAAATAAAT AAAAAGGAGA GGCAGACAAT GTCTGGATTC CTGTTTTTTG 1980 
GTTACCTGAT TTCCATGATC ATGATGCTTC TTGTCAACAC CCTCTTAAGC AGCACCAGAA 2040 
ACAGTGAGTT TGTCTGTACC ATTAGGAGTT AGGTACTAAT TAGTTGGCTA ATGCTCAAGT 2100 
ATTTTATACC CACAAGAGAG GTATGTCACT CATCTTACTT CCCAGGACAT CCACCCTGAG 2160 
AATAATTTGA CAAGCTTAAA AATGGCCTTC ATGTGAGTGC CAAATTTTGT TTTTCTTCAT 2220 
TTAAATATTT TCTTTGCCTA AATACATGTG AGAGGAGTTA AATATAAATG TACAGAGAGG 2280 
AAAGTTGAGT TCCACCTCTG AAATGAGAAT TACTTGACAG TTGGGATACT TTAATCAGAA 2340 
AAAAAGAACT TATTTGCAGC ATTTTATCAA CAAATTTCAT AATTGTGGAC AATTGGAGGC 2400 
ATTTATTTTA AAAAACAATT TTATTGGCCT TTTGCTAACA CAGTAAGCAT GTATTTTATA 2460 
AGGCATTCAA TAAATGCACA ACGCCCAAAG GAAATAAAAT CCTATCTAAT CCTACTCTCC 2520 
ACTACACAGA GGTAATCACT ATTAGTATTT TGGCATATTA TTCTCCAGGT GTTTGCTTAT 2580 
GCACTTATAA AATGATTTGA ACAAATAAAA CTAGGAACCT GTATACATGT GTTTCATAAC 2640 
CTGCCTCCTT TGCTTGGCCC TTTATTGAGA TAAGTTTTCC TGTCAAGAAA GCAGAAACCA 2700 
TCTCATTTCT AACAGCTGTG TTATATTCCA TAGTATGCAT TACTCAACAA ACTGTTGTGC 2760 
TATTGGATAC TTAGGTGGTT TCTTCACTGA CAATACTGAA TAAACATCTC ACCGGAATTC 



SEQ ID NO:8 BCX2 Protein sequence: 
Protein Accession #: NP_003005.1 

1 11 21 31 41 51 
| | | | | | 

MFLSILVALC LWLHLALGVR GAPCEAVRIP MCRHMPWNIT RMPNHLHHST QENAILAIEQ 60 
YEELVDVNCS AVLRFFFCAM YAPICTLEFL HDPIKPCKSV CQRARDDCEP LMKMYNHSWP 120 
ESLACDELPV YDRGVCISPE AIVTDLPEDV KWIDITPDMM VQERPLDVDC KRLSPDRCKC 180 
KKVKPTLATY LSKNYSYVIH AKIKAVQRSG CNEVTTVVDV KEIFKSSSPI PRTQVPLITN 240 
SSCQCPHILP HQDVLIMCYE WRSRMMLLEN CLVEKWRDQL SKRSIQWEER LQEQRRTVQD 300 
KKKTAGRTSR SNPPKPKGKP PAPKPASPKK NIKTRSAQKR TNPKRV 
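
Each full line of these listings carries six blocks of ten residues followed by a running residue count, so a transcription can be checked mechanically. The sketch below is an illustration only and is not part of the original listing; the function name `check_counts` and the toy input are assumptions made for the example.

```python
import re


def check_counts(lines):
    """Compare each line's trailing running count with the residues actually seen.

    Returns (line_index, printed_count, counted) for every mismatching line.
    """
    problems = []
    running = 0
    for index, line in enumerate(lines):
        match = re.search(r"(\d+)\s*$", line)        # trailing counter, if any
        body = line[:match.start()] if match else line
        running += sum(ch.isalpha() for ch in body)  # letters only, spaces ignored
        if match:
            printed = int(match.group(1))
            if printed != running:
                problems.append((index, printed, running))
                running = printed                    # resynchronise on the printed value
    return problems


# Toy input (hypothetical): the second line is one residue short.
toy = ["AAAAAAAAAA AAAAAAAAAA 20", "AAAAAAAAA AAAAAAAAAA 40"]
print(check_counts(toy))
```

A mismatch reported by such a check points to a dropped or merged character in that line, which is a common kind of scanning damage in blocked sequence text.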

SEQ ID NO:9 CBK1 DNA SEQUENCE 

Nucleic Acid Accession #: NM_032391 

Coding sequence: 129-302 (underlined sequences correspond to start and stop codons) 

1 11 21 31 41 51 

| | | | | | 

GTCCTTCCTC TCCTAGCCTA AGGCGTGCAA ACAGAGCGCC ACTGGGAGGC TGAAACCTTT 60 

AGGCCGATGC TTGCTTGCAA GGTCAGGCAA GCTGGATTCT GGTCCCCACC TTTGCAGAGA 120 

GAACAGCGAT GTTGTGCGCC CATTTCTCAG ATCAAGGACC GGCCCATCTT ACTACCTCCA 180 

AGAGTGCTTT TCTCTCTAAT AAGAAAACAT CTACTTTGAA ACATCTACTG GGCGAGACCA 240 

GGAGTGATGG CTCAGCCTGT AATTCTGGAA TTTCGGGAGG CCGAGGCAGG AAGATTCCTT 300 

GAGCACAGGA GTTCCAGACC AGCCTGGGCA ATGTAGCAAG ACGCTGTCTC TATTTATACA 360 
ATAAAATTTT TTTAAAAAAG G 
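
For this record the "Coding sequence" coordinates read as 1-based and inclusive, with the stop codon counted: 129-302 spans 174 nucleotides, that is 57 codons plus a stop. The sketch below shows how such a range could be cut out of the blocked text and translated with the standard genetic code. It is an illustration under those assumptions, not part of the original listing; the names `clean` and `translate_cds` and the toy sequence are hypothetical.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (first base slowest).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def clean(seq_text):
    """Keep only A/C/G/T, dropping block spaces and position counters."""
    return "".join(ch for ch in seq_text.upper() if ch in "ACGT")


def translate_cds(seq_text, start, end):
    """Translate positions start..end (1-based, inclusive) of a blocked sequence."""
    cds = clean(seq_text)[start - 1:end]
    if len(cds) % 3 != 0:
        raise ValueError("coding range length is not a multiple of three")
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, len(cds), 3))


# Toy input (hypothetical): two flanking bases, ATG TTG TGC TGA, two more bases.
print(translate_cds("GG ATGTTGTGC TGA CC 16", 3, 14))
```

The `*` in the result marks the stop codon. Ambiguity codes such as N are discarded by `clean` in this sketch, so it only suits sequences made of the four unambiguous bases.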



SEQ ID NO:10 CBK1 Protein sequence: 
Protein Accession #: NP_115767 

1 11 21 31 41 51 

| | | | | | 

MLCAHFSDQG PAHLTTSKSA FLSNKKTSTL KHLLGETRSD GSACNSGISG GRGRKIP 

SEQ ID NO:11 CHA1 DNA SEQUENCE 

Nucleic Acid Accession #: NM_020162 

Coding sequence: 96-854 (underlined sequences correspond to start and stop codons) 

1 11 21 31 41 51 

| | | | | | 

TCCTTGGGTT CGGGTGAAAG CGCCTGGGGG TTCGTGGCCA TGATCCCCGA GCTGCTGGAG 60 

AACTGAAGGC GGACAGTCTC CTGCGAAACC AGGCAATGGC GGAGCTGGAG TTTGTTCAGA 120 

TCATCATCAT CGTGGTGGTG ATGATGGTGA TGGTGGTGGT GATCACGTGC CTGCTGAGCC 180 

ACTACAAGCT GTCTGCACGG TCCTTCATCA GCCGGCACAG CCAGGGGCGG AGGAGAGAAG 240 

ATGCCCTGTC CTCAGAAGGA TGCCTGTGGC CCTCGGAGAG CACAGTGTCA GGCAACGGAA 300 

TCCCAGAGCC GCAGGTCTAC GCCCCGCCTC GGCCCACCGA CCGCCTGGCC GTGCCGCCCT 360 

TCGCCCAGCG GGAGCGCTTC CACCGCTTCC AGCCCACCTA TCCGTACCTG CAGCACGAGA 420 

TCGACCTGCC ACCCACCATC TCGCTGTCAG ACGGGGAGGA GCCCCCACCC TACCAGGGCC 480 

CCTGCACCCT CCAGCTTCGG GACCCCGAGC AGCAGCTGGA ACTGAACCGG GAGTCGGTGC 540 

GCGCACCCCC AAACAGAACC ATCTTCGACA GTGACCTGAT GGATAGTGCC AGGCTGGGCG 600 

GCCCCTGCCC CCCCAGCAGT AACTCGGGCA TCAGCGCCAC GTGCTACGGC AGCGGCGGGC 660 

GCATGGAGGG GCCGCCGCCC ACCTACAGCG AGGTCATCGG CCACTACCCG GGGTCCTCCT 720 

TCCAGCACCA GCAGAGCAGT GGGCCGCCCT CCTTGCTGGA GGGGACCCGG CTCCACCACA 780 

CACACATCGC GCCCCTAGAG AGCGCAGCCA TCTGGAGCAA AGAGAAGGAT AAACAGAAAG 840 

GACACCCTCT CTAGGGTCCC CAGGGGGGCC GGGCTGGGGC TGCGTAGGTG AAAAGGCAGA 900 

ACACTCCGCG CTTCTTAGAA GAGGAGTGAG AGGAAGGCGG GGGGCGCAGC AACGCATCGT 960 

GTGGCCCTCC CCTCCCACCT CCCTGTGTAT AAATATTTAC ATGTGATGTC TGGTCTGAAT 1020 

GCACAAGCTA AGAGAGCTTG CAAAAAAAAA AAGAAAAAAG AAAAAAAAAA ACCACGTTTC 1080 

TTTGTTGAGC TGTGTCTTGA AGGCAAAAGA AAAAAAATTT CTACAGTAAA AAAAAAAAAA 1140 



SEQ ID NO:12 CHA1 Protein sequence: 
Protein Accession #: NP_064567 

1 11 21 31 41 51 

| | | | | | 

MAELEFVQII IIVVVMMVMV VVITCLLSHY KLSARSFISR HSQGRRREDA LSSEGCLWPS 60 
ESTVSGNGIP EPQVYAPPRP TDRLAVPPFA QRERFHRFQP TYPYLQHEID LPPTISLSDG 120 
EEPPPYQGPC TLQLRDPEQQ LELNRESVRA PPNRTIFDSD LMDSARLGGP CPPSSNSGIS 180 
ATCYGSGGRM EGPPPTYSEV IGHYPGSSFQ HQQSSGPPSL LEGTRLHHTH IAPLESAAIW 240 
SKEKDKQKGH PL 

SEQ ID NO:13 CJA5 DNA SEQUENCE 

Nucleic Acid Accession #: NM_012445 

Coding sequence: 276-1271 (underlined sequences correspond to start and stop codons) 

1 11 21 31 41 51 

| | | | | | 

Attorney Docket No.: 018501-004200US 



GCACGAGGGA AGAGGGTGAT CCGACCCGGG GAAGGTCGCT GGGCAGGGCG AGTTGGGAAA 60 
GCGGCAGCCC CCGCCGCCCC CGCAGCCCCT TCTCCTCCTT TCTCCCACGT CCTATCTGCC 120 
TCTCGCTGGA GGCCAGGCCG TGCAGCATCG AAGACAGGAG GAACTGGAGC CTCATTGGCC 180 
GGCCCGGGGC GCCGGCCTCG GGCTTAAATA GGAGCTCCGG GCTCTGGCTG GGACCCGACC 240 
GCTGCCGGCC GCGCTCCCGC TGCTCCTGCC GGGTGATGGA AAACCCCAGC CCGGCCGCCG 300 
CCCTGGGCAA GGCCCTCTGC GCTCTCCTCC TGGCCACTCT CGGCGCCGCC GGCCAGCCTC 360 
TTGGGGGAGA GTCCATCTGT TCCGCCAGAG CCCCGGCCAA ATACAGCATC ACCTTCACGG 420 
GCAAGTGGAG CCAGACGGCC TTCCCCAAGC AGTACCCCCT GTTCCGCCCC CCTGCGCAGT 480 
GGTCTTCGCT GCTGGGGGCC GCGCATAGCT CCGACTACAG CATGTGGAGG AAGAACCAGT 540 
ACGTCAGTAA CGGGCTGCGC GACTTTGCGG AGCGCGGCGA GGCCTGGGCG CTGATGAAGG 600 
AGATCGAGGC GGCGGGGGAG GCGCTGCAGA GCGTGCACGC GGTGTTTTCG GCGCCCGCCG 660 
TCCCCAGCGG CACCGGGCAG ACGTCGGCGG AGCTGGAGGT GCAGCGCAGG CACTCGCTGG 720 
TCTCGTTTGT GGTGCGCATC GTGCCCAGCC CCGACTGGTT CGTGGGCGTG GACAGCCTGG 780 
ACCTGTGCGA CGGGGACCGT TGGCGGGAAC AGGCGGCGCT GGACCTGTAC CCCTACGACG 840 
CCGGGACGGA CAGCGGCTTC ACCTTCTCCT CCCCCAACTT CGCCACCATC CCGCAGGACA 900 
CGGTGACCGA GATAACGTCC TCCTCTCCCA GCCACCCGGC CAACTCCTTC TACTACCCGC 960 
GGCTGAAGGC CCTGCCTCCC ATCGCCAGGG TGACACTGGT GCGGCTGCGA CAGAGCCCCA 1020 
GGGCCTTCAT CCCTCCCGCC CCAGTCCTGC CCAGCAGGGA CAATGAGATT GTAGACAGCG 1080 
CCTCAGTTCC AGAAACGCCG CTGGACTGCG AGGTCTCCCT GTGGTCGTCC TGGGGACTGT 1140 
GCGGAGGCCA CTGTGGGAGG CTCGGGACCA AGAGCAGGAC TCGCTACGTC CGGGTCCAGC 1200 
CCGCCAACAA CGGGAGCCCC TGCCCCGAGC TCGAAGAAGA GGCTGAGTGC GTCCCTGATA 1260 
ACTGCGTCTA AGACCAGAGC CCCGCAGCCC CTGGGGCCCC CGGAGCCATG GGGTGTCGGG 1320 
GGCTCCTGTG CAGGCTCATG CTGCAGGCGG CCGAGGCACA GGGGGTTTCG CGCTGCTCCT 1380 
GACCGCGGTG AGGCCGCGCC GACCATCTCT GCACTGAAGG GCCCTCTGGT GGCCGGCACG 1440 
GGCATTGGGA AACAGCCTCC TCCTTTCCCA ACCTTGCTTC TTAGGGGCCC CCGTGTCCCG 1500 
TCTGCTCTCA GCCTCCTCCT CCTGCAGGAT AAAGTCATCC CCAAGGCTCC AGCTACTCTA 1560 
AATTATGGTC TCCTTATAAG TTATTGCTGC TCCAGGAGAT TGTCCTTCAT CGTCCAGGGG 1620 
CCTGGCTCCC ACGTGGTTGC AGATACCTCA GACCTGGTGC TCTAGGCTGT GCTGAGCCCA 1680 
CTCTCCCGAG GGCGCATCCA AGCGGGGGCC ACTTGAGAAG TGAATAAATG GGGCGGTTTC 1740 
GGAAGCGTCA GTGTTTCCAT GTTATGGATC TCTCTGCGTT TGAATAAAGA CTATCTCTGT 1800 
TGCTCAC 



SEQ ID NO:14 CJA5 Protein sequence: 
Protein Accession #: NP_036577 

1 11 21 31 41 51 

| | | | | | 

MENPSPAAAL GKALCALLLA TLGAAGQPLG GESICSARAP AKYSITFTGK WSQTAFPKQY 60 
PLFRPPAQWS SLLGAAHSSD YSMWRKNQYV SNGLRDFAER GEAWALMKEI EAAGEALQSV 120 
HAVFSAPAVP SGTGQTSAEL EVQRRHSLVS FVVRIVPSPD WFVGVDSLDL CDGDRWREQA 180 
ALDLYPYDAG TDSGFTFSSP NFATIPQDTV TEITSSSPSH PANSFYYPRL KALPPIARVT 240 
LVRLRQSPRA FIPPAPVLPS RDNEIVDSAS VPETPLDCEV SLWSSWGLCG GHCGRLGTKS 300 
RTRYVRVQPA NNGSPCPELE EEAECVPDNC V 



SEQ ID NO:15 LBH9 DNA SEQUENCE 

Nucleic Acid Accession #: NM_002391 

Coding sequence: 26-457 (underlined sequences correspond to start and stop codons) 

1 11 21 31 41 51 

| | | | | | 

CGGGCGAAGC AGCGCGGGCA GCGAGATGCA GCACCGAGGC TTCCTCCTCC TCACCCTCCT 60 
CGCCCTGCTG GCGCTCACCT CCGCGGTCGC CAAAAAGAAA GATAAGGTGA AGAAGGGCGG 120 
CCCGGGGAGC GAGTGCGCTG AGTGGGCCTG GGGGCCCTGC ACCCCCAGCA GCAAGGATTG 180 
CGGCGTGGGT TTCCGCGAGG GCACCTGCGG GGCCCAGACC CAGCGCATCC GGTGCAGGGT 240 
GCCCTGCAAC TGGAAGAAGG AGTTTGGAGC CGACTGCAAG TACAAGTTTG AGAACTGGGG 300 
TGCGTGTGAT GGGGGCACAG GCACCAAAGT CCGCCAAGGC ACCCTGAAGA AGGCGCGCTA 360 
CAATGCTCAG TGCCAGGAGA CCATCCGCGT CACCAAGCCC TGCACCCCCA AGACCAAAGC 420 
AAAGGCCAAA GCCAAGAAAG GGAAGGGAAA GGACTAGACG CCAAGCCTGG ATGCCAAGGA 480 
GCCCCTGGTG TCACATGGGG CCTGGCCACG CCCTCCCTCT CCCAGGCCCG AGATGTGACC 540 
CACCAGTGCC TTCTGTCTGC TCGTTAGCTT TAATCAATCA TGCCCTGCCT TGTCCCTCTC 600 
ACTCCCCAGC CCCACCCCTA AGTGCCCAAA GTGGGGAGGG ACAAGGGATT CTGGGAAGCT 660 
TGAGCCTCCC CCAAAGCAAT GTGAGTCCCA GAGCCCGCTT TTGTTCTTCC CCACAATTCC 720 
ATTACTAAGA AACACATCAA ATAAACTGAC TTTTTCCCCC CAATAAAAGC TCTTCTTTTT 780 
TAATAT 



SEQ ID NO:16 LBH9 Protein sequence: 
Protein Accession #: NP_002382 

1 11 21 31 41 51 

| | | | | | 

MQHRGFLLLT LLALLALTSA VAKKKDKVKK GGPGSECAEW AWGPCTPSSK DCGVGFREGT 60 
CGAQTQRIRC RVPCNWKKEF GADCKYKFEN WGACDGGTGT KVRQGTLKKA RYNAQCQETI 120 
RVTKPCTPKT KAKAKAKKGK GKD 

SEQ ID NO:17 LEM9 DNA SEQUENCE 

Nucleic Acid Accession #: NM_005244 

Coding sequence: 1-1617 (underlined sequences correspond to start and stop codons) 

1 11 21 31 41 51 

| | | | | | 

ATGGTAGAAC TAGTGATCTC ACCCAGCCTC ACTGTAAACA GCGATTGTCT GGATAAACTG 60 

AAGTTTAACC GTGCTGACGC TGCTGTGTGG ACTCTGAGTG ACAGACAAGG CATCACCAAA 120 

TCGGCCCCCC TGAGAGTGTC CCAGCTCTTC TCCAGATCTT GCCCACGTGT CCTCCCCCGC 180 

CAGCCTTCCA CAGCCATGGC AGCCTACGGC CAGACGCAGT ACAGTGCGGG GATCCAGCAG 240 

GCTACCCCCT ATACAGCTTA CCCACCTCCA GCACAAGCCT ATGGAATCCC TTCCTACAGC 300 

ATCAAGACAG AAGACAGCTT GAACCATTCC CCTGGCCAGA GTGGATTCCT CAGCTATGGC 360 

TCCAGCTTCA GCACCTCACC CACTGGACAG AGCCCATACA CCTACCAGAT GCACGGCACA 420 

ACAGGGTTCT ATCAAGGAGG AAATGGACTG GGCAACGCAG CCGGTTTCGG GAGTGTGCAC 480 

CAGGACTATC CTTCCTACCC CGGCTTCCCC CAGAGCCAGT ACCCCCAGTA TTACGGCTCA 540 

TCCTACAACC CTCCCTACGT CCCGGCCAGC AGCATCTGCC CTTCGCCCCT CTCCACGTCC 600 

ACCTACGTCC TCCAGGAGGC ATCTCACAAC GTCCCCAACC AGAGTTCCGA GTCACTTGCT 660 

GGTGAATACA ACACACACAA TGGACCTTCC ACACCAGCGA AAGAGGGAGA CACAGACAGG 720 

CCGCACCGGG CCTCCGACGG GAAGCTCCGA GGCCGGTCTA AGAGGAGCAG TGACCCGTCC 780 

CCGGCAGGGG ACAATGAGAT TGAGCGTGTG TTCGTGTGGG ACTTGGATGA GACAATAATT 840 

ATTTTTCACT CCTTACTCAC GGGGACATTT GCATCCAGAT ACGGGAAGGA CACCACGACG 900 

TCCGTGCGCA TTGGCCTTAT GATGGAAGAG ATGATCTTCA ACCTTGCAGA TACACATCTG 960 

TTCTTCAATG ACCTGGAGGA TTGTGACCAG ATCCACGTTG ATGACGTCTC ATCAGATGAC 1020 

AATGGCCAAG ATTTAAGCAC ATACAACTTC TCCGCTGACG GCTTCCACAG TTCGGCCCCA 1080 

GGAGCCAACC TGTGCCTGGG CTCTGGCGTG CACGGCGGCG TGGACTGGAT GAGGAAGCTG 1140 

GCCTTCCGCT ACCGGCGGGT GAAGGAGATG TACAATACCT ACAAGAACAA CGTTGGTGGG 1200 

TTGATAGGCA CTCCCAAAAG GGAGACCTGG CTACAGCTCC GAGCTGAGCT GGAAGCTCTC 1260 

ACAGACCTCT GGCTGACCCA CTCCCTGAAG GCACTAAACC TCATCAACTC CCGGCCCAAC 1320 

TGTGTCAATG TGCTGGTCAC CACCACTCAA CTAATTCCTG CCCTGGCCAA AGTCCTGCTA 1380 

TATGGCCTGG GGTCTGTGTT TCCTATTGAG AACATCTACA GTGCAACCAA GACAGGGAAG 1440 

GAGAGCTGCT TCGAGAGGAT AATGCAGAGA TTCGGCAGAA AAGCTGTCTA CGTGGTGATC 1500 

GGTGATGGTG TGGAAGAGGA GCAAGGAGCG AAAAAGCACA ACATGCCTTT CTGGCGGATA 1560 
TCCTGCCACG CAGACCTGGA GGCACTGAGG CACGCCCTGG AACTGGAGTA TTTATAG 

SEQ ID NO:18 LEM9 Protein sequence: 
Protein Accession #: NP_005235 

1 11 21 31 41 51 

| | | | | | 

MVELVISPSL TVNSDCLDKL KFNRADAAVW TLSDRQGITK SAPLRVSQLF SRSCPRVLPR 60 
QPSTAMAAYG QTQYSAGIQQ ATPYTAYPPP AQAYGIPSYS IKTEDSLNHS PGQSGFLSYG 120 
SSFSTSPTGQ SPYTYQMHGT TGFYQGGNGL GNAAGFGSVH QDYPSYPGFP QSQYPQYYGS 180 
SYNPPYVPAS SICPSPLSTS TYVLQEASHN VPNQSSESLA GEYNTHNGPS TPAKEGDTDR 240 
PHRASDGKLR GRSKRSSDPS PAGDNEIERV FVWDLDETII IFHSLLTGTF ASRYGKDTTT 300 
SVRIGLMMEE MIFNLADTHL FFNDLEDCDQ IHVDDVSSDD NGQDLSTYNF SADGFHSSAP 360 
GANLCLGSGV HGGVDWMRKL AFRYRRVKEM YNTYKNNVGG LIGTPKRETW LQLRAELEAL 420 
TDLWLTHSLK ALNLINSRPN CVNVLVTTTQ LIPALAKVLL YGLGSVFPIE NIYSATKTGK 480 
ESCFERIMQR FGRKAVYVVI GDGVEEEQGA KKHNMPFWRI SCHADLEALR HALELEYL 

SEQ ID NO:19 OAA1 DNA SEQUENCE 

Nucleic Acid Accession #: NM_002740 

Coding sequence: 178-1968 (underlined sequences correspond to start and stop codons) 

1 11 21 31 41 51 

| | | | | | 

CCGCGGTTCC GGCTGCTCCG GCGAGGCGAC CCTTGGGTCG GCGCTGCGGG CGAGGTGGGC 60 

AGGTAGGTGG GCGGACGGCC GCGGTTCTCC GGCAAGCGCA GGCGGCGGAG TCCCCCACGG 120 

CGCCCGAAGC GCCCCCCGCA CCCCCGGCCT CCAGCGTTGA GGCGGGGGAG TGAGGAGATG 180 

CCGACCCAGA GGGACAGCAG CACCATGTCC CACACGGTCG CAGGCGGCGG CAGCGGGGAC 240 

CATTCCCACC AGGTCCGGGT GAAAGCCTAC TACCGCGGGG ATATCATGAT AACACATTTT 300 

GAACCTTCCA TCTCCTTTGA GGGCCTTTGC AATGAGGTTC GAGACATGTG TTCTTTTGAC 360 

AACGAACAGC TCTTCACCAT GAAATGGATA GATGAGGAAG GAGACCCGTG TACAGTATCA 420 

TCTCAGTTGG AGTTAGAAGA AGCCTTTAGA CTTTATGAGC TAAACAAGGA TTCTGAACTC 480 

TTGATTCATG TGTTCCCTTG TGTACCAGAA CGTCCTGGGA TGCCTTGTCC AGGAGAAGAT 540 

AAATCCATCT ACCGTAGAGG TGCACGCCGC TGGAGAAAGC TTTATTGTGC CAATGGCCAC 600 

ACTTTCCAAG CCAAGCGTTT CAACAGGCGT GCTCACTGTG CCATCTGCAC AGACCGAATA 660 

TGGGGACTTG GACGCCAAGG ATATAAGTGC ATCAACTGCA AACTCTTGGT TCATAAGAAG 720 

TGCCATAAAC TCGTCACAAT TGAATGTGGG CGGCATTCTT TGCCACAGGA ACCAGTGATG 780 

CCCATGGATC AGTCATCCAT GCATTCTGAC CATGCACAGA CAGTAATTCC ATATAATCCT 840 

TCAAGTCATG AGAGTTTGGA TCAAGTTGGT GAAGAAAAAG AGGCAATGAA CACCAGGGAA 900 

AGTGGCAAAG CTTCATCCAG TCTAGGTCTT CAGGATTTTG ATTTGCTCCG GGTAATAGGA 960 

AGAGGAAGTT ATGCCAAAGT ACTGTTGGTT CGATTAAAAA AAACAGATCG TATTTATGCA 1020 

ATGAAAGTTG TGAAAAAAGA GCTTGTTAAT GATGATGAGG ATATTGATTG GGTACAGACA 1080 

GAGAAGCATG TGTTTGAGCA GGCATCCAAT CATCCTTTCC TTGTTGGGCT GCATTCTTGC 1140 

TTTCAGACAG AAAGCAGATT GTTCTTTGTT ATAGAGTATG TAAATGGAGG AGACCTAATG 1200 

TTTCATATGC AGCGACAAAG AAAACTTCCT GAAGAACATG CCAGATTTTA CTCTGCAGAA 1260 






ATCAGTCTAG CATTAAATTA TCTTCATGAG CGAGGGATAA TTTATAGAGA TTTGAAACTG 1320 

GACAATGTAT TACTGGACTC TGAAGGCCAC ATTAAACTCA CTGACTACGG CATGTGTAAG 1380 

GAAGGATTAC GGCCAGGAGA TACAACCAGC ACTTTCTGTG GTACTCCTAA TTACATTGCT 1440 

CCTGAAATTT TAAGAGGAGA AGATTATGGT TTCAGTGTTG ACTGGTGGGC TCTTGGAGTG 1500 

CTCATGTTTG AGATGATGGC AGGAAGGTCT CCATTTGATA TTGTTGGGAG CTCCGATAAC 1560 

CCTGACCAGA ACACAGAGGA TTATCTCTTC CAAGTTATTT TGGAAAAACA AATTCGCATA 1620 

CCACGTTCTC TGTCTGTAAA AGCTGCAAGT GTTCTGAAGA GTTTTCTTAA TAAGGACCCT 1680 

AAGGAACGAT TGGGTTGTCA TCCTCAAACA GGATTTGCTG ATATTCAGGG ACACCCGTTC 1740 

TTCCGAAATG TTGATTGGGA TATGATGGAG CAAAAACAGG TGGTACCTCC CTTTAAACCA 1800 

AATATTTCTG GGGAATTTGG TTTGGACAAC TTTGATTCTC AGTTTACTAA TGAACCTGTC 1860 

CAGCTCACTC CAGATGACGA TGACATTGTG AGGAAGATTG ATCAGTCTGA ATTTGAAGGT 1920 

TTTGAGTATA TCAATCCTCT TTTGATGTCT GCAGAAGAAT GTGTCTGATC CTCATTTTTC 1980 

AACCATGTAT TCTACTCATG TTGCCATTTA ATGCATGGAT AAACTTGCTG CAAGCCTGGA 2040 

TACAATTAAC CATTTTATAT TTGCCACCTA CAAAAAAACA CCCAATATCT TCTCTTGTAG 2100 

ACTATATGAA TCAATTATTA CATCTGTTTT ACTATGAAAA AAAAATTAAT ACTACTAGCT 2160 

TCCAGACAAT CATGTCAAAA TTTAGTTGAA CTGGTTTTTC AGTTTTTAAA AGGCCTACAG 2220 
ATGAGTAATG AAGTTACCTT TTTTGTTTAA AAAAAAAAAA G 



SEQ ID NO:20 OAA1 Protein sequence: 
Protein Accession #: NP_002731 

1 11 21 31 41 51 

| | | | | | 

MSHTVAGGGS GDHSHQVRVK AYYRGDIMIT HFEPSISFEG LCNEVRDMCS FDNEQLFTMK 60 
WIDEEGDPCT VSSQLELEEA FRLYELNKDS ELLIHVFPCV PERPGMPCPG EDKSIYRRGA 120 
RRWRKLYCAN GHTFQAKRFN RRAHCAICTD RIWGLGRQGY KCINCKLLVH KKCHKLVTIE 180 
CGRHSLPQEP VMPMDQSSMH SDHAQTVIPY NPSSHESLDQ VGEEKEAMNT RESGKASSSL 240 
GLQDFDLLRV IGRGSYAKVL LVRLKKTDRI YAMKVVKKEL VNDDEDIDWV QTEKHVFEQA 300 
SNHPFLVGLH SCFQTESRLF FVIEYVNGGD LMFHMQRQRK LPEEHARFYS AEISLALNYL 360 
HERGIIYRDL KLDNVLLDSE GHIKLTDYGM CKEGLRPGDT TSTFCGTPNY IAPEILRGED 420 
YGFSVDWWAL GVLMFEMMAG RSPFDIVGSS DNPDQNTEDY LFQVILEKQI RIPRSLSVKA 480 
ASVLKSFLNK DPKERLGCHP QTGFADIQGH PFFRNVDWDM MEQKQVVPPF KPNISGEFGL 540 
DNFDSQFTNE PVQLTPDDDD IVRKIDQSEF EGFEYINPLL MSAEECV 

SEQ ID NO:21 OBH2 DNA SEQUENCE 

Nucleic Acid Accession #: L05628 

Coding sequence: 197-4792 (underlined sequences correspond to start and stop codons) 

1 11 21 31 41 51 

| | | | | | 

CCAGGCGGCG TTGCGGCCCC GGCCCCGGCT CCCTGCGCCG CCGCCGCCGC CGCCGCCGCC 60 

GCCGCCGCCG CCGCCGCCAG CGCTAGCGCC AGCAGCCGGG CCCGATCACC CGCCGCCCGG 120 

TGCCCGCCGC CGCCCGCGCC AGCAACCGGG CCCGATCACC CGCCGCCCGG TGCCCGCCGC 180 

CGCCCGCGCC ACCGGCATGG CGCTCCGGGG CTTCTGCAGC GCCGATGGCT CCGACCCGCT 240 

CTGGGACTGG AATGTCACGT GGAATACCAG CAACCCCGAC TTCACCAAGT GCTTTCAGAA 300 

CACGGTCCTC GTGTGGGTGC CTTGTTTTTA CCTCTGGGCC TGTTTCCCCT TCTACTTCCT 360 

CTATCTCTCC CGACATGACC GAGGCTACAT TCAGATGACA CCTCTCAACA AAACCAAAAC 420 

TGCCTTGGGA TTTTTGCTGT GGATCGTCTG CTGGGCAGAC CTCTTCTACT CTTTCTGGGA 480 

AAGAAGTCGG GGCATATTCC TGGCCCCAGT GTTTCTGGTC AGCCCAACTC TCTTGGGCAT 540 

CACCACGCTG CTTGCTACCT TTTTAATTCA GCTGGAGAGG AGGAAGGGAG TTCAGTCTTC 600 

AGGGATCATG CTCACTTTCT GGCTGGTAGC CCTAGTGTGT GCCCTAGCCA TCCTGAGATC 660 

CAAAATTATG ACAGCCTTAA AAGAGGATGC CCAGGTGGAC CTGTTTCGTG ACATCACTTT 720 

CTACGTCTAC TTTTCCCTCT TACTCATTCA GCTCGTCTTG TCCTGTTTCT CAGATCGCTC 780 

ACCCCTGTTC TCGGAAACCA TCCACGACCC TAATCCCTGC CCAGAGTCCA GCGCTTCCTT 840 

CCTGTCGAGG ATCACCTTCT GGTGGATCAC AGGGTTGATT GTCCGGGGCT ACCGCCAGCC 900 

CCTGGAGGGC AGTGACCTCT GGTCCTTAAA CAAGGAGGAC ACGTCGGAAC AAGTCGTGCC 960 

TGTTTTGGTA AAGAACTGGA AGAAGGAATG CGCCAAGACT AGGAAGCAGC CGGTGAAGGT 1020 

TGTGTACTCC TCCAAGGATC CTGCCCAGCC GAAAGAGAGT TCCAAGGTGG ATGCGAATGA 1080 

GGAGGTGGAG GCTTTGATCG TCAAGTCCCC ACAGAAGGAG TGGAACCCCT CTCTGTTTAA 1140 

GGTGTTATAC AAGACCTTTG GGCCCTACTT CCTCATGAGC TTCTTCTTCA AGGCCATCCA 1200 

CGACCTGATG ATGTTTTCCG GGCCGCAGAT CTTAAAGTTG CTCATCAAGT TCGTGAATGA 1260 

CACGAAGGCC CCAGACTGGC AGGGCTACTT CTACACCGTG CTGCTGTTTG TCACTGCCTG 1320 

CCTGCAGACC CTCGTGCTGC ACCAGTACTT CCACATCTGC TTCGTCAGTG GCATGAGGAT 1380 

CAAGACCGCT GTCATTGGGG CTGTCTATCG GAAGGCCCTG GTGATCACCA ATTCAGCCAG 1440 

AAAATCCTCC ACGGTCGGGG AGATTGTCAA CCTCATGTCT GTGGACGCTC AGAGGTTCAT 1500 

GGACTTGGCC ACGTACATTA AGATGATCTG GTCAGCCCCC CTGCAAGTCA TCCTTGCTCT 1560 

CTACCTCCTG TGGCTGAATC TGGGCCCTTC CGTCCTGGCT GGAGTGGCGG TGATGGTCCT 1620 

CATGGTGCCC GTCAATGCTG TGATGGCGAT GAAGACCAAG ACGTATCAGG TGGCCCACAT 1680 

GAAGAGCAAA GACAATCGGA TCAAGCTGAT GAACGAAATT CTCAATGGGA TCAAAGTGCT 1740 

AAAGCTTTAT GCCTGGGAGC TGGCATTCAA GGACAAGGTG CTGGCCATCA GGCAGGAGGA 1800 

GCTGAAGGTG CTGAAGAAGT CTGCCTACCT GTCAGCCGTG GGCACCTTCA CCTGGGTCTG 1860 

CACGCCCTTT CTGGTGGCCT TGTGCACATT TGCCGTCTAC GTGACCATTG ACGAGAACAA 1920 

CATCCTGGAT GCCCAGACAG CCTTCGTGTC TTTGGCCTTG TTCAACATCC TCCGGTTTCC 1980 

CCTGAACATT CTCCCCATGG TCATCAGCAG CATCGTGCAG GCGAGTGTCT CCCTCAAACG 2040 

CCTGAGGATC TTTCTCTCCC ATGAGGAGCT GGAACCTGAC AGCATCGAGC GACGGCCTGT 2100 

CAAAGACGGC GGGGGCACGA ACAGCATCAC CGTGAGGAAT GCCACATTCA CCTGGGCCAG 2160 

GAGCGACCCT CCCACACTGA ATGGCATCAC CTTCTCCATC CCCGAAGGTG CTTTGGTGGC 2220 
CGTGGTGGGC CAGGTGGGCT GCGGAAAGTC GTCCCTGCTC TCAGCCCTCT TGGCTGAGAT 2280 

GGACAAAGTG GAGGGGCACG TGGCTATCAA GGGCTCCGTG GCCTATGTGC CACAGCAGGC 2340 

CTGGATTCAG AATGATTCTC TCCGAGAAAA CATCCTTTTT GGATGTCAGC TGGAGGAACC 2400 

ATATTACAGG TCCGTGATAC AGGCCTGTGC CCTCCTCCCA GACCTGGAAA TCCTGCCCAG 2460 

TGGGGATCGG ACAGAGATTG GCGAGAAGGG CGTGAACCTG TCTGGGGGCC AGAAGCAGCG 2520 

CGTGAGCCTG GCCCGGGCCG TGTACTCCAA CGCTGACATT TACCTCTTCG ATGATCCCCT 2580 

CTCAGCAGTG GATGCCCATG TGGGAAAACA CATCTTTGAA AATGTGATTG GCCCCAAGGG 2640 

GATGCTGAAG AACAAGACGC GGATCTTGGT CACGCACAGC ATGAGCTACT TGCCGCAGGT 2700 

GGACGTCATC ATCGTCATGA GTGGCGGCAA GATCTCTGAG ATGGGCTCCT ACCAGGAGCT 2760 

GCTGGCTCGA GACGGCGCCT TCGCTGAGTT CCTGCGTACC TATGCCAGCA CAGAGCAGGA 2820 

GCAGGATGCA GAGGAGAACG GGGTCACGGG CGTCAGCGGT CCAGGGAAGG AAGCAAAGCA 2880 

AATGGAGAAT GGCATGCTGG TGACGGACAG TGCAGGGAAG CAACTGCAGA GACAGCTCAG 2940 

CAGCTCCTCC TCCTATAGTG GGGACATCAG CAGGCACCAC AACAGCACCG CAGAACTGCA 3000 

GAAAGCTGAG GCCAAGAAGG AGGAGACCTG GAAGCTGATG GAGGCTGACA AGGCGCAGAC 3060 

AGGGCAGGTC AAGCTTTCCG TGTACTGGGA CTACATGAAG GCCATCGGAC TCTTCATCTC 3120 

CTTCCTCAGC ATCTTCCTTT TCATGTGTAA CCATGTGTCC GCGCTGGCTT CCAACTATTG 3180 

GCTCAGCCTC TGGACTGATG ACCCCATCGT CAACGGGACT CAGGAGCACA CGAAAGTCCG 3240 

GCTGAGCGTC TATGGAGCCC TGGGCATTTC ACAAGGGATC GCCGTGTTTG GCTACTCCAT 3300 

GGCCGTGTCC ATCGGGGGGA TCTTGGCTTC CCGCTGTCTG CACGTGGACC TGCTGCACAG 3360 

CATCCTGCGG TCACCCATGA GCTTCTTTGA GCGGACCCCC AGTGGGAACC TGGTGAACCG 3420 

CTTCTCCAAG GAGCTGGACA CAGTGGACTC CATGATCCCG GAGGTCATCA AGATGTTCAT 3480 

GGGCTCCCTG TTCAACGTCA TTGGTGCCTG CATCGTTATC CTGCTGGCCA CGCCCATCGC 3540 

CGCCATCATC ATCCCGCCCC TTGGCCTCAT CTACTTCTTC GTCCAGAGGT TCTACGTGGC 3600 

TTCCTCCCGG CAGCTGAAGC GCCTCGAGTC GGTCAGCCGC TCCCCGGTCT ATTCCCATTT 3660 

CAACGAGACC TTGCTGGGGG TCAGCGTCAT TCGAGCCTTC GAGGAGCAGG AGCGCTTCAT 3720 

CCACCAGAGT GACCTGAAGG TGGACGAGAA CCAGAAGGCC TATTACCCCA GCATCGTGGC 3780 

CAACAGGTGG CTGGCCGTGC GGCTGGAGTG TGTGGGCAAC TGCATCGTTC TGTTTGCTGC 3840 

CCTGTTTGCG GTGATCTCCA GGCACAGCCT CAGTGCTGGC TTGGTGGGCC TCTCAGTGTC 3900 

TTACTCATTG CAGGTCACCA CGTACTTGAA CTGGCTGGTT CGGATGTCAT CTGAAATGGA 3960 

AACCAACATC GTGGCCGTGG AGAGGCTCAA GGAGTATTCA GAGACTGAGA AGGAGGCGCC 4020 

CTGGCAAATC CAGGAGACAG CTCCGCCCAG CAGCTGGCCC CAGGTGGGCC GAGTGGAATT 4080 

CCGGAACTAC TGCCTGCGCT ACCGAGAGGA CCTGGACTTC GTTCTCAGGC ACATCAATGT 4140 

CACGATCAAT GGGGGAGAAA AGGTCGGCAT CGTGGGGCGG ACGGGAGCTG GGAAGTCGTC 4200 

CCTGACCCTG GGCTTATTTC GGATCAACGA GTCTGCCGAA GGAGAGATCA TCATCGATGG 4260 

CATCAACATC GCCAAGATCG GCCTGCACGA CCTCCGCTTC AAGATCACCA TCATCCCCCA 4320 

GGACCCTGTT TTGTTTTCGG GTTCCCTCCG AATGAACCTG GACCCATTCA GCCAGTACTC 4380 

GGATGAAGAA GTCTGGACGT CCCTGGAGCT GGCCCACCTG AAGGACTTCG TGTCAGCCCT 4440 

TCCTGACAAG CTAGACCATG AATGTGCAGA AGGCGGGGAG AACCTCAGTG TCGGGCAGCG 4500 

CCAGCTTGTG TGCCTAGCCC GGGCCCTGCT GAGGAAGACG AAGATCCTTG TGTTGGATGA 4560 

GGCCACGGCA GCCGTGGACC TGGAAACGGA CGACCTCATC CAGTCCACCA TCCGGACACA 4620 

GTTCGAGGAC TGCACCGTCC TCACCATCGC CCACCGGCTC AACACCATCA TGGACTACAC 4680 

AAGGGTGATC GTCTTGGACA AAGGAGAAAT CCAGGAGTAC GGCGCCCCAT CGGACCTCCT 4740 

GCAGCAGAGA GGTCTTTTCT ACAGCATGGC CAAAGACGCC GGCTTGGTGT GAGCCCCAGA 4800 

GCTGGCATAT CTGGTCAGAA CTGCAGGGCC TATATGCCAG CGCCCAGGGA GGAGTCAGTA 4860 

CCCCTGGTAA ACCAAGCCTC CCACACTGAA ACCAAAACAT AAAAACCAAA CCCAGACAAC 4920 

CAAAACATAT TCAAAGCAGC AGCCACCGCC ATCCGGTCCC CTGCCTGGAA CTGGCTGTGA 4980 
AGACCCAGGA GAGACAGAGA TGCGAACCAC C 



SEQ ID NO:22 OBH2 Protein sequence: 
Protein Accession #: AAB46616 

1 11 21 31 41 51 

| | | | | | 

MALRGFCSAD GSDPLWDWNV TWNTSNPDFT KCFQNTVLVW VPCFYLWACF PFYFLYLSRH 60 

DRGYIQMTPL NKTKTALGFL LWIVCWADLF YSFWERSRGI FLAPVFLVSP TLLGITTLLA 120 

TFLIQLERRK GVQSSGIMLT FWLVALVCAL AILRSKIMTA LKEDAQVDLF RDITFYVYFS 180 

LLLIQLVLSC FSDRSPLFSE TIHDPNPCPE SSASFLSRIT FWWITGLIVR GYRQPLEGSD 240 

LWSLNKEDTS EQVVPVLVKN WKKECAKTRK QPVKVVYSSK DPAQPKESSK VDANEEVEAL 300 

IVKSPQKEWN PSLFKVLYKT FGPYFLMSFF FKAIHDLMMF SGPQILKLLI KFVNDTKAPD 360 

WQGYFYTVLL FVTACLQTLV LHQYFHICFV SGMRIKTAVI GAVYRKALVI TNSARKSSTV 420 

GEIVNLMSVD AQRFMDLATY INMIWSAPLQ VILALYLLWL NLGPSVLAGV AVMVLMVPVN 480 

AVMAMKTKTY QVAHMKSKDN RIKLMNEILN GIKVLKLYAW ELAFKDKVLA IRQEELKVLK 540 

KSAYLSAVGT FTWVCTPFLV ALCTFAVYVT IDENNILDAQ TAFVSLALFN ILRFPLNILP 600 

MVISSIVQAS VSLKRLRIFL SHEELEPDSI ERRPVKDGGG TNSITVRNAT FTWARSDPPT 660 

LNGITFSIPE GALVAVVGQV GCGKSSLLSA LLAEMDKVEG HVAIKGSVAY VPQQAWIQND 720 

SLRENILFGC QLEEPYYRSV IQACALLPDL EILPSGDRTE IGEKGVNLSG GQKQRVSLAR 780 

AVYSNADIYL FDDPLSAVDA HVGKHIFENV IGPKGMLKNK TRILVTHSMS YLPQVDVIIV 840 

MSGGKISEMG SYQELLARDG AFAEFLRTYA STEQEQDAEE NGVTGVSGPG KEAKQMENGM 900 

LVTDSAGKQL QRQLSSSSSY SGDISRHHNS TAELQKAEAK KEETWKLMEA DKAQTGQVKL 960 

SVYWDYMKAI GLFISFLSIF LFMCNHVSAL ASNYWLSLWT DDPIVNGTQE HTKVRLSVYG 1020 

ALGISQGIAV FGYSMAVSIG GILASRCLHV DLLHSILRSP MSFFERTPSG NLVNRFSKEL 1080 

DTVDSMIPEV IKMFMGSLFN VIGACIVILL ATPIAAIIIP PLGLIYFFVQ RFYVASSRQL 1140 

KRLESVSRSP VYSHFNETLL GVSVIRAFEE QERFIHQSDL KVDENQKAYY PSIVANRWLA 1200 

VRLECVGNCI VLFAALFAVI SRHSLSAGLV GLSVSYSLQV TTYLNWLVRM SSEMETNIVA 1260 

VERLKEYSET EKEAPWQIQE TAPPSSWPQV GRVEFRNYCL RYREDLDFVL RHINVTINGG 1320 

EKVGIVGRTG AGKSSLTLGL FRINESAEGE IIIDGINIAK IGLHDLRFKI TIIPQDPVLF 1380 

SGSLRMNLDP FSQYSDEEVW TSLELAHLKD FVSALPDKLD HECAEGGENL SVGQRQLVCL 1440 

ARALLRKTKI LVLDEATAAV DLETDDLIQS TIRTQFEDCT VLTIAHRLNT IMDYTRVIVL 1500 
DKGEIQEYGA PSDLLQQRGL FYSMAKDAGL V 



SEQ ID NO:23 PAA2 DNA SEQUENCE 

Nucleic Acid Accession #: NM_013309 
Coding sequence: 1-1290 (underlined sequences correspond to start and stop codons) 

1 11 21 31 41 51 

| | | | | | 

ATGGCCGGCT CTGGCGCGTG GAAGCGCCTC AAATCTATGC TAAGGAAGGA TGATGCGCCG 60 

10 CTGTTTTTAA ATGACACCAG CGCCTTTGAC TTCTCGGATG AGGCGGGGGA CGAGGGGCTT 120 

TCTCGGTTCA ACAAACTTCG AGTTGTGGTG GCCGATGACG GTTCCGAAGC CCCGGAAAGG 180 

CCTGTTAACG GGGCGCACCC GACCCTCCAG GCCGACGATG ATTCCTTACT GGACCAAGAC 240 

TTAC CTTTGA CCAACAGTCA GCTGAGTTTG AAGGTGGACT CCTGTGACAA CTGCAGCAAA 300 

CAGAGAGAGA TACTGAAGCA GAGAAAGGTG AAAGCCAGGT TGACCATTGC TGCCGTTCTG 360 

15 TACTTGCTTT TCATGATTGG AGAACTTGTA GGTGGATACA TTGCAAATAG CCTAGCAATC 420 

ATGACAGATG CACTTCATAT GTTAACTGAC CTAAGCGCCA TCATACTCAC CCTGCTTGCT 480 

TTGTGGCTAT CATCAAAATC ACCAACCAAA AGATTCACCT TTGGATTTCA TCGCTTAGAG 540 

GTTTTGTCAG CTATGATTAG TGTGCTGTTG GTGTATATAC TTATGGGATT CCTCTTATAT 600 

GAAGCTGTGC AAAGAACTAT CCATATGAAC TATGAAATAA ATGGAGATAT AATGCTCATC 660 

20 ACCGCAGCTG TTGGAGTTGC AGTTAATGTA ATAATGGGGT TTCTGTTGAA CCAGTCTGGT 720 

CACCGTCACT CCCATTCCCA CTCCCTGCCT TCAAATTCCC CTACCAGAGG TTCTGGGTGT 7 80 

GAACGTAACC ATGGGCAGGA TAGCCTGGCA GTGAGAGCTG CATTTGTACA TGC TTTGGGA 840 

GATTTGGTAC AGAGTGTTGG TGTGCTAATA GCTGCATACA TCATACGATT CAAGCCAGAA 900 

TACAAGATTG CTGATCCCAT CTGTACATAC GTATTTTCAT TACTTGTGGC TTTTACAACA 960 

TTTCGAATCA TATGGGATAC AGTAGTTATA ATACTAGAAG GTGTGCCAAG CCATTTGAAT 1020

GTAGACTATA TCAAAGAAGC CTTGATGAAA ATAGAAGATG TATATTCAGT CGAAGATTTA 1080

AATATCTGGT CTCTCACTTC AGGAAAATCT ACTGCCATAG TTCACATACA GCTAATTCCT 1140

GGAAGTTCAT CTAAATGGGA GGAAGTACAG TCCAAAGCAA ACCATTTATT ATTGAACACA 1200

TTTGGCATGT ATAGATGTAC TATTCAGCTT CAGAGTTACA GGCAAGAAGT GGACAGAACT 1260

TGTGCAAATT GTCAGAGTTC TAGTCCCTGA
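
The coding-sequence coordinates quoted for each DNA entry (for example 1-1290 for SEQ ID NO:23) can be sanity-checked once the listing layout is stripped away. The following is a minimal Python sketch and not part of the original listing. The function names `clean_sequence` and `check_coding_range` are hypothetical, and the sketch assumes 1-based inclusive coordinates and the standard stop codons TAA, TAG and TGA. The input shown is a made-up toy fragment in the listing's layout, not one of the sequences above.

```python
import re

STOP_CODONS = {"TAA", "TAG", "TGA"}


def clean_sequence(listing_text: str) -> str:
    """Join listing lines into one uppercase string.

    Drops everything that is not a letter (spaces, line breaks and the
    running position counts at the end of each row). Stray margin marks
    that contain letters would have to be removed by hand first.
    """
    return re.sub(r"[^A-Za-z]", "", listing_text).upper()


def check_coding_range(sequence: str, start: int, end: int) -> bool:
    """Return True if the 1-based inclusive range start..end looks like a CDS.

    The range must have a length divisible by three, begin with ATG and
    end with one of the standard stop codons.
    """
    cds = sequence[start - 1:end]
    return (
        len(cds) > 0
        and len(cds) % 3 == 0
        and cds.startswith("ATG")
        and cds[-3:] in STOP_CODONS
    )


# Toy fragment laid out like a listing row (blocks of ten plus a count).
toy_listing = "ATGGCCGGCT CTTGA 15\n"
toy_sequence = clean_sequence(toy_listing)
print(toy_sequence)                             # ATGGCCGGCTCTTGA
print(check_coding_range(toy_sequence, 1, 15))  # True
```

For a real entry, the whole sequence block would be passed to `clean_sequence` and the quoted coordinates to `check_coding_range`.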



SEQ ID NO:24 PAA2 Protein sequence: 
Protein Accession #: NP_037441

1 11 21 31 41 51

| | | | | |

MAGSGAWKRL KSMLRKDDAP LFLNDTSAFD FSDEAGDEGL SRFNKLRVVV ADDGSEAPER 60

PVNGAHPTLQ ADDDSLLDQD LPLTNSQLSL KVDSCDNCSK QREILKQRKV KARLTIAAVL 120

YLLFMIGELV GGYIANSLAI MTDALHMLTD LSAIILTLLA LWLSSKSPTK RFTFGFHRLE 180

VLSAMISVLL VYILMGFLLY EAVQRTIHMN YEINGDIMLI TAAVGVAVNV IMGFLLNQSG 240

HRHSHSHSLP SNSPTRGSGC ERNHGQDSLA VRAAFVHALG DLVQSVGVLI AAYIIRFKPE 300

YKIADPICTY VFSLLVAFTT FRIIWDTVVI ILEGVPSHLN VDYIKEALMK IEDVYSVEDL 360

NIWSLTSGKS TAIVHIQLIP GSSSKWEEVQ SKANHLLLNT FGMYRCTIQL QSYRQEVDRT 420
CANCQSSSP

SEQ ID NO:25 PAA3 DNA SEQUENCE 

Nucleic Acid Accession #: AB037765

Coding sequence: 375-2798 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51

| | | | | |

GCCGAGTCGG TGGCGGCTGC AGGCTGGGAG GGAGAAGTGC TACGCCTTTG CAGGTTGGCG 60

AAGTGGTTCC AGGCTACCCG GCTAGTCTGG CACGGCCCCG TCTTCTGCCT CCTCCTCCGT 120

CGCGTGGCGG CGGGAACTGT TGGCCGCGCG GCCTCGGGAA CGGCCCAGGT CCCCGCCCGC 180 

AGGTCCCGGG CAGATAACAT AGATCATCAG TAGAAAACTT CTTGAAGTTG TTCAAGAAAA 240 

ATTTGAAAGT AGCAAAATAG AAAATAAAGA ATTAACAGCA GATACAGAGG ACAGCATGGA 300

AGTGTTGTCT TAGGAAACAG AACACAGCAG TGAAAAAACA GACAAAATCC GCTCAGATAC 360 

AACTGCAGCT GATAATGTTT TCCGGCTTCA ATGTCTTTAG AGTTGGGATC TCTTTTGTCA 420

TAATGTGCAT TTTTTACATG CCAACAGTAA ACTCTTTACC AGAACTGAGT CCTCAGAAAT 480

ATTTTAGTAC ATTGCAACCA GGTCTTGAAG AACTGAATGA GGCTGTTAGA CCTCTGCAGG 540 

ACTATGGAAT TTCAGTTGCC AAGGTTAATT GTGTCAAAGA AGAAATATCA AGATACTGTG 600

GAAAAGAAAA GGATTTGATG AAAGCATATT TATTCAAGGG CAACATATTG CTCAGAGAAT 660

TCCCTACTGA CACCTTGTTT GATGTGAATG CCATTGTCGC CCATGTTCTC TTTGCTCTTC 720

TTTTTAGTGA AGTGAAATAT ATTACCAACC TGGAAGACCT TCAGAACATA GAAAATGCTC 780

TGAAAGGAAA AGCAAATATT ATATTCTCAT ATGTAAGAGC CATTGGAATA CCAGAGCACA 840

GAGCAGTCAT GGAAGCCGGT TTTGTGTATG GGACTACATA CCAATTTGTC TTAACCACAG 900

AAATTGCCCT TTTGGAAAGT ATTGGCTCTG AGGATGTGGA ATATGCACAT CTCTACTTTT 960 

TTCATTGTAA ACTAGTCTTG GACTTGACCC AGCAATGTAG AAGAACACTA ATGGAACAGC 1020 

CATTGACTAC ACTGAACATT CACCTGTTTA TTAAGACAAT GAAAGCACCT CTGTTGACTG 1080 

AAGTTGCTGA AGATCCTCAA CAAGTTTCAA CTGTCCATCT CCAACTGGGC TTACCACTGG 1140 

TTTTTATTGT TAGCCAACAG GCTACTTATG AAGCTGATAG AAGAACTGCA GAATGGGTTG 1200

CTTGGCGTCT TCTGGGAAAA GCAGGAGTTC TACTCTTGTT AAGGGACTCT TTGGAAGTGA 1260

ACATTCCTCA AGATGCTAAT GTGGTCTTCA AAAGAGCAGA AGAGGGAGTT CCAGTGGAAT 1320 

TTTTGGTATT ACATGATGTT GATTTAATAA TATCTCATGT GGAAAATAAT ATGCACATTG 1380 

AGGAAATACA AGAAGATGAA GACAATGACA TGGAAGGTCC AGATATAGAT GTTCAGGATG 1440 

ATGAAGTGGC AGAAACTGTT TTCAGAGATA GGAAGAGAAA ATTACCTTTG GAACTTACAG 1500

TGGAACTAAC AGAAGAAACA TTTAATGCAA CAGTGATGGC TTCTGACAGC ATAGTACTCT 1560

TCTATGCTGG TTGGCAAGCA GTATCCATGG CATTTTTGCA ATCCTATATT GATGTGGCAG 1620 

TTAAACTGAA AGGCACATCT ACTATGCTTC TTACTAGAAT AAACTGTGCA GATTGGTCTG 1680 

ATGTATGTAC TAAGCAAAAT GTTACTGAAT TTCCTATCAT AAAGATGTAC AAGAAAGGCG 1740

AGAACCCAGT ATCTTATGCT GGAATGTTAG GAACCAAAGA TCTCCTAAAA TTTATCCAGC 1800

TCAACAGGAT TTCATATCCA GTGAATATAA CATCGATCCA AGAAGCAGAA GAATATTTAA 1860 

GTGGGGAATT ATATAAAGAC CTCATCTTGT ATTCTAGTGT GTCAGTATTG GGACTATTTA 1920 

GTCCAACCAT GAAAACAGCA AAAGAAGATT TTAGTGAAGC AGGAAACTAC CTAAAAGGAT 1980 

ATGTTATCAC TGGAATTTAT TCTGAAGAAG ATGTTTTGCT ACTGTCAACC AAATATGCTG 2040 

CAAGTCTTCC AGCCCTGCTG CTTGCCAGAC ACACAGAAGG CAAAATAGAG AGCATCCCAC 2100 

TAGCTAGCAC ACATGCACAA GACATAGTTC AAATAATAAC AGATGCACTA CTGGAAATGT 2160 

TTCCGGAAAT CACTGTGGAA AATCTTCCCA GTTATTTCAG ACTTCAGAAA CCATTATTGA 2220 

TTTTGTTCAG TGATGGCACT GTAAATCCTC AATATAAAAA AGCAATATTG ACACTGGTAA 2280

AGCAGAAATA CTTGGATTCA TTTACTCCAT GCTGGTTAAA TCTAAAGAAT ACTCCAGTGG 2340 

GGAGAGGAAT CTTGCGGGCA TATTTTGATC CTCTGCCTCC CCTTCCTCTT CTTGTTTTGG 2400 

TGAATCTGCA TTCAGGTGGC CAAGTATTTG CATTTCCTTC AGACCAGGCT ATAATTGAAG 2460 

AAAACCTTGT ATTGTGGCTG AAGAAATTAG AAGCAGGACT AGAAAATCAT ATCACAATTT 2520 

TACCTGCTCA AGAATGGAAA CCTCCTCTTC CAGCTTATGA TTTTCTAAGT ATGATAGATG 2580 

CCGCAACATC TCAACGTGGC ACTAGGAAAG TTCCCAAGTG TATGAAAGAA ACAGATGTGC 2640 

AGGAGAATGA TAAGGAACAA CATGAAGATA AATCGGCAGT CAGAAAAGAA CCGATTGAAA 2700 

CTCTGAGAAT AAAGCATTGG AATAGAAGTA ATTGGTTTAA AGAAGCAGAA AAATCATTTA 2760 

GACGTGATAA AGAGTTAGGA TGCTCAAAAG TGAACTAATT TTATAGGGCT GTGGTTTCCA 2820 

AAATTTTTTT GGCATGATAG ACTTAATTTA TTTCCTTAAA GAATAATATT AAATCATTTC 2880 

AAGTTTGCAG ACTAGTGCCA TCCAATAGAA TTATAATATA AGTCACATAT TTTATTTAAA 2940 

ATTTTCTAGT AACTACATTA AACAAAGTAA AAGTGAGCAG GGCAAAATAA TTTTGATATT 3000 

ACTTTTCACC CAGTAGTATA CCCAAAATAG CGAAATATAG AAATTATTAA TGAGATATTT 3060

TACATCCTTT TTTGTACCAA GTCTTCTAAA TGCAGTACAT ATTTTATACT TACTGCATTT 3120 

CTTACTTCCG AGTAGCCATA TTTCAAGTGT TCATTGCCAC ATGTGGCCTG TGACTACTGT 3180 

ATTGGACAGT TCAGTACTAG ACAAAAACTA GCATAATTAA CTTAGTTCTA GCCATGATTT 3240

CTATTTGGAT TAAAATTAAA CTCTAATCAC AGTTAACTCC ACAGTGCATT CATGCAGCTG 3300 

ACAGTTATAT TTGTTTTATT GGAGTCATGA TATTAAAATC AGCGTTTGTC AACCTCAGGG 3360 

GATATTTAGC AATTGTCGGG AGACATTTTT GATGTCATGA CTAGGGCAGT TATTGACATT 3420 

TAGTGAGTAG AGGCCATGGA TCCTGCTAAA TAACCTGCAT TGGACAGCGC CCCACAACAA 3480

AGAATTATCC TGCCCGAAAT GGTAGTCGTG CCAAGGCTGA GTAACCTTGT GTTAAAAGTA 3540 

ACCTGTGGCA GACTAGGTTT CCAGAATTTC CTGGTTCTGC TCACGTATCA TGTTTGAAAA 3600 

AATTTTGGCT ATTAAAGATA TGTATTAGAT GGTCTTATCC TGATTATTAC CTGGATACAA 3660 

CTTGATCTTT TCTAATATTT TCAGAAAGTG ATGGGATAAC CCTAGAAGAG GACTCAGAAT 3720

GATATTTATA TTTTAAGTGA GTCTTAAAAC CTCCTCTTAT TTCTACAAGT TATATGGCTA 3780 

AATTTCAGAT TGAACAGGGA TTCAGCATTC TGCCATCTCC TCATGGAAAG AGAGGCTCCC 3840 

TCATCTGAAG CGTCTCTGAA ATCTACCCTT GCAAGCTTCA GACAAATCAG TTGATCTCCC 3900 

TGAGCCACAC GGCCTCATTC TGTGAGGGAG GGAAAGATTA GCCAAAGAGT TAATTTTCAT 3960 

TCCAAATCAC TTAGCTGTTA GACTGATCTG TTTGTAGCAG TTGTTTGTCT CATTTTTGCT 4020

CTGTGCATTT TTTGAGACAT TTGTTGAGAA TATTCTATTT GGTGCTCTAC TGTATTTTTC 4080 

TTTTTAATAT CTACTTGATA TCTTGTTCTT TAAATTTTCT TCACATATGG TTTGCCTGAT 4140

ACAACTGATT TTTATAACTG AAATTTAAGG AATCTAACAG CTAAAACTCA GTAAGTGCAT 4200

MTATTTCCTT ATAACATAGA CCCGTTGCTA CTCTCAGCAC CCTCTCCTCA ATTTTTTTTC 4260

CTGTAGCATG TGATGCCTGA TTAAACTCAT TTTCATTTGC TTTTATTTCT AATATGGGAA 4320 

CAATGAGAGT GAACTCTAAA TATAGGTTGT AGTAATAAAA CATCATTAGC CTAATTATTA 4380

GAAAATGCTA ATTAAGTACC AGCACATAGA AACATGAAAT TGCTTAGTCA TTGTACCTTT 4440 

GTCAGCAATT TTGACAGTCA TTAATGTTTG TCATAATTTT AAATAAAGTG TCTGGGTTTC 4500 
AGAATACCTT CAAAAAAAAA AAAAAA 



SEQ ID NO:26 PAA3 Protein sequence:
Protein Accession #: BAA92582

1 11 21 31 41 51

| | | | | |

MFSGFNVFRV GISFVIMCIF YMPTVNSLPE LSPQKYFSTL QPGLEELNEA VRPLQDYGIS 60
VAKVNCVKEE ISRYCGKEKD LMKAYLFKGN ILLREFPTDT LFDVNAIVAH VLFALLFSEV 120
KYITNLEDLQ NIENALKGKA NIIFSYVRAI GIPEHRAVME AGFVYGTTYQ FVLTTEIALL 180
ESIGSEDVEY AHLYFFHCKL VLDLTQQCRR TLMEQPLTTL NIHLFIKTMK APLLTEVAED 240
PQQVSTVHLQ LGLPLVFIVS QQATYEADRR TAEWVAWRLL GKAGVLLLLR DSLEVNIPQD 300
ANVVFKRAEE GVPVEFLVLH DVDLIISHVE NNMHIEEIQE DEDNDMEGPD IDVQDDEVAE 360
TVFRDRKRKL PLELTVELTE ETFNATVMAS DSIVLFYAGW QAVSMAFLQS YIDVAVKLKG 420
TSTMLLTRIN CADWSDVCTK QNVTEFPIIK MYKKGENPVS YAGMLGTKDL LKFIQLNRIS 480
YPVNITSIQE AEEYLSGELY KDLILYSSVS VLGLFSPTMK TAKEDFSEAG NYLKGYVITG 540
IYSEEDVLLL STKYAASLPA LLLARHTEGK IESIPLASTH AQDIVQIITD ALLEMFPEIT 600
VENLPSYFRL QKPLLILFSD GTVNPQYKKA ILTLVKQKYL DSFTPCWLNL KNTPVGRGIL 660
RAYFDPLPPL PLLVLVNLHS GGQVFAFPSD QAIIEENLVL WLKKLEAGLE NHITILPAQE 720
WKPPLPAYDF LSMIDAATSQ RGTRKVPKCM KETDVQENDK EQHEDKSAVR KEPIETLRIK 780 
HWNRSNWFKE AEKSFRRDKE LGCSKVN 

SEQ ID NO:27 PAA5 DNA SEQUENCE 

Nucleic Acid Accession #: NM_012449

Coding sequence: 66-1085 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51

| | | | | |

CCGAGACTCA CGGTCAAGCT AAGGCGAAGA GTGGGTGGCT GAAGCCATAC TATTTTATAG 60
AATTAATGGA AAGCAGAAAA GACATCACAA ACCAAGAAGA ACTTTGGAAA ATGAAGCCTA 120
GGAGAAATTT AGAAGAAGAC GATTATTTGC ATAAGGACAC GGGAGAGACC AGCATGCTAA 180

AAAGACCTGT GCTTTTGCAT TTGCACCAAA CAGCCCATGC TGATGAATTT GACTGCCCTT 240 

CAGAACTTCA GCACACACAG GAACTCTTTC CACAGTGGCA CTTGCCAATT AAAATAGCTG 300 

CTATTATAGC ATCTCTGACT TTTCTTTACA CTCTTCTGAG GGAAGTAATT CACCCTTTAG 360

CAACTTCCCA TCAACAATAT TTTTATAAAA TTCCAATCCT GGTCATCAAC AAAGTCTTGC 420 

CAATGGTTTC CATCACTCTC TTGGCATTGG TTTACCTGCC AGGTGTGATA GCAGCAATTG 480 

TCCAACTTCA TAATGGAACC AAGTATAAGA AGTTTCCACA TTGGTTGGAT AAGTGGATGT 540 

TAACAAGAAA GCAGTTTGGG CTTCTCAGTT TCTTTTTTGC TGTACTGCAT GCAATTTATA 600 

GTCTGTCTTA CCCAATGAGG CGATCCTACA GATACAAGTT GCTAAACTGG GCATATCAAC 660 

AGGTCCAACA AAATAAAGAA GATGCCTGGA TTGAGCATGA TGTTTGGAGA ATGGAGATTT 720 

ATGTGTCTCT GGGAATTGTG GGATTGGCAA TACTGGCTCT GTTGGCTGTG ACATCTATTC 780

CATCTGTGAG TGACTCTTTG ACATGGAGAG AATTTCACTA TATTCAGAGC AAGCTAGGAA 840

TTGTTTCCCT TCTACTGGGC ACAATACACG CATTGATTTT TGCCTGGAAT AAGTGGATAG 900

ATATAAAACA ATTTGTATGG TATACACCTC CAACTTTTAT GATAGCTGTT TTCCTTCCAA 960 

TTGTTGTCCT GATATTTAAA AGCATACTAT TCCTGCCATG CTTGAGGAAG AAGATACTGA 1020 

AGATTAGACA TGGTTGGGAA GACGTCACCA AAATTAACAA AACTGAGATA TGTTCCCAGT 1080 

TGTAGAATTA CTGTTTACAC ACATTTTTGT TCAATATTGA TATATTTTAT CACCAACATT 1140 
TCAAGTTTGT ATTTGTTAAT AAAATGATTA TTCAAGGAAA AAAAAAAAAA AAAAA 

SEQ ID NO:28 PAA5 Protein sequence 
Protein Accession #: NP_036581

1 11 21 31 41 51

| | | | | |

MESRKDITNQ EELWKMKPRR NLEEDDYLHK DTGETSMLKR PVLLHLHQTA HADEFDCPSE 60

LQHTQELFPQ WHLPIKIAAI IASLTFLYTL LREVIHPLAT SHQQYFYKIP ILVINKVLPM 120

VSITLLALVY LPGVIAAIVQ LHNGTKYKKF PHWLDKWMLT RKQFGLLSFF FAVLHAIYSL 180 

SYPMRRSYRY KLLNWAYQQV QQNKEDAWIE HDVWRMEIYV SLGIVGLAIL ALLAVTSIPS 240 

VSDSLTWREF HYIQSKLGIV SLLLGTIHAL IFAWNKWIDI KQFVWYTPPT FMIAVFLPIV 300 
VLIFKSILFL PCLRKKILKI RHGWEDVTKI NKTEICSQL 

SEQ ID NO:29 PAA7 DNA SEQUENCE 

Nucleic Acid Accession #: NM_030774 

Coding sequence: 1-963 (underlined sequences correspond to start and stop codons) 

1 11 21 31 41 51

| | | | | |

ATGAGTTCCT GCAACTTCAC ACATGCCACC TTTGTGCTTA TTGGTATCCC AGGATTAGAG 60

AAAGCCCATT TCTGGGTTGG CTTCCCCCTC CTTTCCATGT ATGTAGTGGC AATGTTTGGA 120

AACTGCATCG TGGTCTTCAT CGTAAGGACG GAACGCAGCC TGCACGCTCC GATGTACCTC 180 

TTTCTCTGCA TGCTTGCAGC CATTGACCTG GCCTTATCCA CATCCACCAT GCCTAAGATC 240 

CTTGCCCTTT TCTGGTTTGA TTCCCGAGAG ATTAGCTTTG AGGCCTGTCT TACCCAGATG 300 

TTCTTTATTC ATGCCCTCTC AGCCATTGAA TCCACCATCC TGCTGGCCAT GGCCTTTGAC 360 

CGTTATGTGG CCATCTGCCA CCCACTGCGC CATGCTGCAG TGCTCAACAA TACAGTAACA 420 

GCCCAGATTG GCATCGTGGC TGTGGTCCGC GGATCCCTCT TTTTTTTCCC ACTGCCTCTG 480 

CTGATCAAGC GGCTGGCCTT CTGCCACTCC AATGTCCTCT CGCACTCCTA TTGTGTCCAC 540

CAGGATGTAA TGAAGTTGGC CTATGCAGAC ACTTTGCCCA ATGTGGTATA TGGTCTTACT 600

GCCATTCTGC TGGTCATGGG CGTGGACGTA ATGTTCATCT CCTTGTCCTA TTTTCTGATA 660

ATACGAACGG TTCTGCAACT GCCTTCCAAG TCAGAGCGGG CCAAGGCCTT TGGAACCTGT 720 

GTGTCACACA TTGGTGTGGT ACTCGCCTTC TATGTGCCAC TTATTGGCCT CTCAGTGGTA 780 

CACCGCTTTG GAAACAGCCT TCATCCCATT GTGCGTGTTG TCATGGGTGA CATCTACCTG 840 

CTGCTGCCTC CTGTCATCAA TCCCATCATC TATGGTGCCA AAACCAAACA GATCAGAACA 900 

CGGGTGCTGG CTATGTTCAA GATCAGCTGT GACAAGGACT TGCAGGCTGT GGGAGGCAAG 960 

TGACCCTTAA CACTACACTT CTCCTTATCT TTATTGGCTT GATAAACATA ATTATTTCTA 1020

ACACTAGCTT ATTTCCAGTT GCCCATAAGC ACATCAGTAC TTTTCTCTGG CTGGAATAGT 1080 

AAACTAAAGT ATGGTACATC TACCTAAAGG ACTATTATGT GGAATAATAC ATACTAATGA 1140 

AGTATTACAT GATTTAAAGA CTACAATAAA ACCAAACATG CTTATAACAT TAAGAAAAAC 1200 

AATAAAGATA CATGATTGAA ACCAAGTTGA AAAATAGCAT ATGCCTTGGA GGAAATGTGC 1260 

TCAAATTACT AATGATTTAG TGTTGTCCCT ACTTTCTCTC TCTTTTTTCT TTCTTTTTTT 1320 

TTTATTATGG TTAGCTGTCA CATACAACTT TTTTTTTTTT TGAGATGGGG TCTCGCTCTG 1380 

TCACCAGGCT GGAGTGCAGT GGCGCGATCT CGGCTCACTG CAACCTCCAC ATCCCATGTT 1440 

GAAGTAATTC TTCTGCCTCA GCCTCCCGAG TAGCTGGGAC TAGAGGAACG TGCCACCATG 1500 

ACTGGCTAAT TTTCTGTATT TTTTAGTAGA GACAGAGTTT CACCATGTTG GCCAGGATGG 1560 

TCTCGATCTC CTGACCTTGT GATCCACCCG CCTCAGCCTC CCAAAGTGTT GGGATTACAG 1620 

GTGTGAACCA CTGTGCCCGG CCTGTGTACA ACTTTTTAAA TAGGGAATAT GATAGCTTCG 1680 

CATGGTGGTG TGCACCTATA GCCCCCACTG CCTGGAAAGC TGAGGTGGGA GAATCGCTTG 1740 

AGTCCAGGAG TTTGAGGTTA CAGTGATCCA CGATCGTACC ACTACACTCC AGCCTGGGCA 1800 

ACAGAGCAAG ACCCTGTCTC AAAGCATAAA ATGGAATAAC ATATCAAATG AAACAGGGAA 1860 

AATGAAGCTG ACAATTTATG GAAGCCAGGG CTTGTCACAG TCTCTACTGT TATTATGCAT 1920 

TACCTGGGAA TTTATATAAG CCCTTAATAA TARTGCCAAT GAACATCTCA TGTGTGCTCA 1980 

CAATGTTCTG GCACTATTAT AAGTGCTTCA CAGGTTTTAT GTGTTCTTCG TAACTTTATG 2040 

GAGTAGGTAC CATTTGTGTC TCTTTATTAT AAGTGAGAGA AATGAAGTTT ATATTATCAA 2100 

GGGGACTAAA GTCACACGGC TTGTGGGCAC TGTGCCAAGA TTTAAAATTA AATTTGATGG 2160

TTGAATACAG TTACTTAATG ACCATGTTAT ATTGCTTCCT GTGTAACATC TGCCATTTAT 2220 

TTCCTCAGCT GTACAAATCC TCTGTTTTCT CTCTGTTACA CACTAACATC AATGGCTTTG 2280 

TACTTGTGAT GAGAGATAAC CTTGCCCTAG TTGTGGGCAA CACATGCAGA ATAATCCTGT 2340 

TTTACAGCTG CCTTTCGTGA TCTTATTGCT TGCTTTTTTC CAGATTCAGG GAGAATGTTG 2400

TTGTCTATTT GTCTCTTACA TCTCCTTGAT CATGTCTTCA TTTTTTAATG TGCTCTGTAC 2460

CTGTCAAAAA TTTTGAATGT ACACCACATG CTATTGTCTG AACTTGAGTA TAAGATAAAA 2520
TAAAATTTTA TTTTAAATTT T 
SEQ ID NO:30 PAA7 PROTEIN SEQUENCE

Protein Accession #: NP_110401

1 11 21 31 41 51

| | | | | |

MSSCNFTHAT FVLIGIPGLE KAHFWVGFPL LSMYVVAMFG NCIVVFIVRT ERSLHAPMYL
FLCMLAAIDL ALSTSTMPKI LALFWFDSRE ISFEACLTQM FFIHALSAIE STILLAMAFD
RYVAICHPLR HAAVLNNTVT AQIGIVAVVR GSLFFFPLPL LIKRLAFCHS NVLSHSYCVH
QDVMKLAYAD TLPNVVYGLT AILLVMGVDV MFISLSYFLI IRTVLQLPSK SERAKAFGTC
VSHIGVVLAF YVPLIGLSVV HRFGNSLHPI VRVVMGDIYL LLPPVINPII YGAKTKQIRT
RVLAMFKISC DKDLQAVGGK 

SEQ ID NO:31 PAV6 DNA SEQUENCE

Nucleic Acid Accession #: XM_050837 

Coding sequence: 1-1020 (underlined sequences correspond to start and stop codons) 

1 11 21 31 41 51 

| | | | | |

ATGAACTGGG AGCTGCTGCT GTGGCTGCTG GTGCTGTGCG CGCTGCTCCT GCTCTTGGTG 
CAGCTGCTGC GCTTCCTGAG GGCTGACGGC GACCTGACGC TACTATGGGC CGAGTGGCAG 
GGACGACGCC CAGAATGGGA GCTGACTGAT ATGGTGGTGT GGGTGACTGG AGCCTCGAGT 
GGAATTGGTG AGGAGCTGGC TTACCAGTTG TCTAAACTAG GAGTTTCTCT TGTGCTGTCA 
GCCAGAAGAG TGCATGAGCT GGAAAGGGTG AAAAGAAGAT GCCTAGAGAA TGGCAATTTA 
AAAGAAAAAG ATATACTTGT TTTGCCCCTT GACCTGACCG ACACTGGTTC CCATGAAGCG 
GCTACCAAAG CTGTTCTCCA GGAGTTTGGT AGAATCGACA TTCTGGTCAA CAATGGTGGA 
ATGTCCCAGC GTTCTCTGTG CATGGATACC AGCTTGGATG TCTACAGAAA GCTAATAGAG 
CTTAACTACT TAGGGACGGT GTCCTTGACA AAATGTGTTC TGCCTCACAT GATCGAGAGG 
AAGCAAGGAA AGATTGTTAC TGTGAATAGC ATCCTGGGTA TCATATCTGT ACCTCTTTCC 
ATTGGATACT GTGCTAGCAA GCATGCTCTC CGGGGTTTTT TTAATGGCCT TCGAACAGAA 
CTTGCCACAT ACCCAGGTAT AATAGTTTCT AACATTTGCC CAGGACCTGT GCAATCAAAT 
ATTGTGGAGA ATTCCCTAGC TGGAGAAGTC ACAAAGACTA TAGGCAATAA TGGAGACCAG 
TCCCACAAGA TGACAACCAG TCGTTGTGTG CGGCTGATGT TAATCAGCAT GGCCAATGAT 
TTGAAAGAAG TTTGGATCTC AGAACAACCT TTCTTGTTAG TAACATATTT GTGGCAATAC 
ATGCCAACCT GGGCCTGGTG GATAACCAAC AAGATGGGGA AGAAAAGGAT TGAGAACTTT 
AAGAGTGGTG TGGATGCAGA CTCTTCTTAT TTTAAAATCT TTAAGACAAA ACATGACTGA 

SEQ ID NO:32 PAV6 Protein sequence 
Protein Accession #: XP_050837

1 11 21 31 41 51

| | | | | |

MNWELLLWLL VLCALLLLLV QLLRFLRADG DLTLLWAEWQ GRRPEWELTD MVVWVTGASS
GIGEELAYQL SKLGVSLVLS ARRVHELERV KRRCLENGNL KEKDILVLPL DLTDTGSHEA
ATKAVLQEFG RIDILVNNGG MSQRSLCMDT SLDVYRKLIE LNYLGTVSLT KCVLPHMIER
KQGKIVTVNS ILGIISVPLS IGYCASKHAL RGFFNGLRTE LATYPGIIVS NICPGPVQSN
IVENSLAGEV TKTIGNNGDQ SHKMTTSRCV RLMLISMAND LKEVWISEQP FLLVTYLWQY 
MPTWAWWITN KMGKKRIENF KSGVDADSSY FKIFKTKHD 

SEQ ID NO:33 PBA6 DNA SEQUENCE 

Nucleic Acid Accession #: NM_006853

Coding sequence: 26-874 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51

| | | | | |

AGGAATCTGC GCTCGGGTTC CGCAGATGCA GAGGTTGAGG TGGCTGCGGG ACTGGAAGTC
ATCGGGCAGA GGTCTCACAG CAGCCAAGGA ACCTGGGGCC CGCTCCTCCC CCCTCCAGGC 
CATGAGGATT CTGCAGTTAA TCCTGCTTGC TCTGGCAACA GGGCTTGTAG GGGGAGAGAC 
CAGGATCATC AAGGGGTTCG AGTGCAAGCC TCACTCCCAG CCCTGGCAGG CAGCCCTGTT 
CGAGAAGACG CGGCTACTCT GTGGGGCGAC GCTCATCGCC CCCAGATGGC TCCTGACAGC 
AGCCCACTGC CTCAAGCCCC GCTACATAGT TCACCTGGGG CAGCACAACC TCCAGAAGGA 
GGAGGGCTGT GAGCAGACCC GGACAGCCAC TGAGTCCTTC CCCCACCCCG GCTTCAACAA 
CAGCCTCCCC AACAAAGACC ACCGCAATGA CATCATGCTG GTGAAGATGG CATCGCCAGT
CTCCATCACC TGGGCTGTGC GACCCCTCAC CCTCTCCTCA CGCTGTGTCA CTGCTGGCAC 
CAGCTGCCTC ATTTCCGGCT GGGGCAGCAC GTCCAGCCCC CAGTTACGCC TGCCTCACAC 
CTTGCGATGC GCCAACATCA CCATCATTGA GCACCAGAAG TGTGAGAACG CCTACCCCGG 
CAACATCACA GACACCATGG TGTGTGCCAG CGTGCAGGAA GGGGGCAAGG ACTCCTGCCA 
GGGTGACTCC GGGGGCCCTC TGGTCTGTAA CCAGTCTCTT CAAGGCATTA TCTCCTGGGG 
CCAGGATCCG TGTGCGATCA CCCGAAAGCC TGGTGTCTAC ACGAAAGTCT GCAAATATGT 
GGACTGGATC CAGGAGACGA TGAAGAACAA TTAGACTGGA CCCACCCACC ACAGCCCATC
ACCCTCCATT TCCACTTGGT GTTTGGTTCC TGTTCACTCT GTTAATAAGA AACCCTAAGC
CAAGACCCTC TACGAACATT CTTTGGGCCT CCTGGACTAC AGGAGATGCT GTCACTTAAT 
AATCAACCTG GGGTTCGAAA TCAGTGAGAC CTGGATTCAA ATTCTGCCTT GAAATATTGT 
GACTCTGGGA ATGACAACAC CTGGTTTGTT CTCTGTTGTA TCCCCAGCCC CAAAGACAGC 
TCCTGGCCAT ATATCAAGGT TTCAATAAAT ATTTGCTAAA TGAGTG 



SEQ ID NO:34 PBA6 PROTEIN SEQUENCE



Protein Accession #: NP_006844 






1 11 21 31 41 51 

| | | | | |

MRILQLILLA LATGLVGGET RIIKGFECKP HSQPWQAALF EKTRLLCGAT LIAPRWLLTA 
AHCLKPRYIV HLGQHNLQKE EGCEQTRTAT ESFPHPGFNN SLPNKDHRND IMLVKMASPV 
SITWAVRPLT LSSRCVTAGT SCLISGWGST SSPQLRLPHT LRCANITIIE HQKCENAYPG
NITDTMVCAS VQEGGKDSCQ GDSGGPLVCN QSLQGIISWG QDPCAITRKP GVYTKVCKYV 
DWIQETMKNN 

SEQ ID NO:35 PBC1 DNA SEQUENCE 

Nucleic Acid Accession #: NM_001775 

Coding sequence: 70-972 (underlined sequences correspond to start and stop codons) 



1 11 21 31 41 51

| | | | | |

CTAAAGCTCT CTTGCTGCCT AGCCTCCTGC CGGCCTCATC TTCGCCCAGC CAACCCCGCC 60
TGGAGCCCTA TGGCCAACTG CGAGTTCAGC CCGGTGTCCG GGGACAAACC CTGCTGCCGG 120
CTCTCTAGGA GAGCCCAACT CTGTCTTGGC GTCAGTATCC TGGTCCTGAT CCTCGTCGTG 180
GTGCTCGCGG TGGTCGTCCC GAGGTGGCGC CAGACGTGGA GCGGTCCGGG CACCACCAAG 240
CGCTTTCCCG AGACCGTCCT GGCGCGATGC GTCAAGTACA CTGAAATTCA TCCTGAGATG 300
AGACATGTAG ACTGCCAAAG TGTATGGGAT GCTTTCAAGG GTGCATTTAT TTCAAAACAT 360
CCTTGCAACA TTACTGAAGA AGACTATCAG CCACTAATGA AGTTGGGAAC TCAGACCGTA 420
CCTTGCAACA AGATTCTTCT TTGGAGCAGA ATAAAAGATC TGGCCCATCA GTTCACACAG 480
GTCCAGCGGG ACATGTTCAC CCTGGAGGAC ACGCTGCTAG GCTACCTTGC TGATGACCTC 540
ACATGGTGTG GTGAATTCAA CACTTCCAAA ATAAACTATC AATCTTGCCC AGACTGGAGA 600
AAGGACTGCA GCAACAACCC TGTTTCAGTA TTCTGGAAAA CGGTTTCCCG CAGGTTTGCA 660
GAAGCTGCCT GTGATGTGGT CCATGTGATG CTCAATGGAT CCCGCAGTAA AATCTTTGAC 720
AAAAACAGCA CTTTTGGGAG TGTGGAAGTC CATAATTTGC AACCAGAGAA GGTTCAGACA 780
CTAGAGGCCT GGGTGATACA TGGTGGAAGA GAAGATTCCA GAGACTTATG CCAGGATCCC 840
ACCATAAAAG AGCTGGAATC GATTATAAGC AAAAGGAATA TTCAATTTTC CTGCAAGAAT 900
ATCTACAGAC CTGACAAGTT TCTTCAGTGT GTGAAAAATC CTGAGGATTC ATCTTGCACA 960
TCTGAGATCT GAGCCAGTCG CTGTGGTTGT TTTAGCTCCT TGACTCCTTG TGGTTTATGT 1020
CATCATACAT GACTCAGCAT ACCTGCTGGT GCAGAGCTGA AGATTTTGGA GGGTCCTCCA 1080
CAATAAGGTC AATGCCAGAG ACGGAAGCCT TTTTCCCCAA AGTCTTAAAA TAACTTATAT 1140
CATCAGCATA CCTTTATTGT GATCTATCAA TAGTCAAGAA AAATTATTGT ATAAGATTAG 1200
AATGAAAATT GTATGTTAAG TTACTTCCTT TAG

SEQ ID NO:36 PBC1 Protein sequence
Protein Accession #: NP_001766

1 11 21 31 41 51

| | | | | |

MANCEFSPVS GDKPCCRLSR RAQLCLGVSI LVLILVVVLA VVVPRWRQTW SGPGTTKRFP 60
ETVLARCVKY TEIHPEMRHV DCQSVWDAFK GAFISKHPCN ITEEDYQPLM KLGTQTVPCN 120
KILLWSRIKD LAHQFTQVQR DMFTLEDTLL GYLADDLTWC GEFNTSKINY QSCPDWRKDC 180
SNNPVSVFWK TVSRRFAEAA CDVVHVMLNG SRSKIFDKNS TFGSVEVHNL QPEKVQTLEA 240
WVIHGGREDS RDLCQDPTIK ELESIISKRN IQFSCKNIYR PDKFLQCVKN PEDSSCTSEI

SEQ ID NO:37 PBH1 DNA SEQUENCE

Nucleic Acid Accession #: XM_017718

Coding sequence: 1-3315 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51

| | | | | |

ATGTCCTTTC GGGCAGCCAG GCTCAGCATG AGGAACAGAA GGAATGACAC TCTGGACAGC 60

ACCCGGACCC TGTACTCCAG CGCGTCTCGG AGCACAGACT TGTCTTACAG TGAAAGCGAC 120

TTGGTGAATT TTATTCAAGC AAATTTTAAG AAACGAGAAT GTGTCTTCTT TACCAAAGAT 180

TCCAAGGCCA CGGAGAATGT GTGCAAGTGT GGCTATGCCC AGAGCCAGCA CATGGAAGGC 240

ACCCAGATCA ACCAAAGTGA GAAATGGAAC TACAAGAAAC ACACCAAGGA ATTTCCTACC 300

GACGCCTTTG GGGATATTCA GTTTGAGACA CTGGGGAAGA AAGGGAAGTA TATACGTCTG 360

TCCTGCGACA CGGACGCGGA AATCCTTTAC GAGCTGCTGA CCCAGCACTG GCACCTGAAA 420

ACACCCAACC TGGTCATTTC TGTGACCGGG GGCGCCAAGA ACTTCGCCCT GAAGCCGCGC 480

ATGCGCAAGA TCTTCAGCCG GCTCATCTAC ATCGCGCAGT CCAAAGGTGC TTGGATTCTC 540

ACGGGAGGCA CCCATTATGG CCTGATGAAG TACATCGGGG AGGTGGTGAG AGATAACACC 600

ATCAGCAGGA GTTCAGAGGA GAATATTGTG GCCATTGGCA TAGCAGCTTG GGGCATGGTC 660

TCCAACCGGG ACACCCTCAT CAGGAATTGC GATGCTGAGG GCTATTTTTT AGCCCAGTAC 720

CTTATGGATG ACTTCACAAG AGATCCACTG TATATCCTGG ACAACAACCA CACACATTTG 780

CTGCTCGTGG ACAATGGCTG TCATGGACAT CCCACTGTCG AAGCAAAGCT CCGGAATCAG 840

CTAGAGAAGT ATATCTCTGA GCGCACTATT CAAGATTCCA ACTATGGTGG CAAGATCCCC 900

ATTGTGTGTT TTGCCCAAGG AGGTGGAAAA GAGACTTTGA AAGCCATCAA TACCTCCATC 960

AAAAATAAAA TTCCTTGTGT GGTGGTGGAA GGCTCGGGCC AGATCGCTGA TGTGATCGCT 1020

AGCCTGGTGG AGGTGGAGGA TGCCCTGACA TCTTCTGCCG TCAAGGAGAA GCTGGTGCGC 1080

TTTTTACCCC GCACGGTGTC CCGGCTGCCT GAGGAGGAGA CTGAGAGTTG GATCAAATGG 1140

CTCAAAGAAA TTCTCGAATG TTCTCACCTA TTAACAGTTA TTAAAATGGA AGAAGCTGGG 1200

GATGAAATTG TGAGCAATGC CATCTCCTAC GCTCTATACA AAGCCTTCAG CACCAGTGAG 1260

CAAGACAAGG ATAACTGGAA TGGGCAGCTG AAGCTTCTGC TGGAGTGGAA CCAGCTGGAC 1320

TTAGCCAATG ATGAGATTTT CACCAATGAC CGCCGATGGG AGTCTGCTGA CCTTCAAGAA 1380

GTCATGTTTA CGGCTCTCAT AAAGGACAGA CCCAAGTTTG TCCGCCTCTT TCTGGAGAAT 1440

GGCTTGAACC TACGGAAGTT TCTCACCCAT GATGTCCTCA CTGAACTCTT CTCCAACCAC 1500

TTCAGCACGC TTGTGTACCG GAATCTGCAG ATCGCCAAGA ATTCCTATAA TGATGCCCTC 1560



CTCACGTTTG TCTGGAAACT GGTTGCGAAC TTCCGAAGAG GCTTCCGGAA GGAAGACAGA 1620

AATGGCCGGG ACGAGATGGA CATAGAACTC CACGACGTGT CTCCTATTAC TCGGCACCCC 1680

CTGCAAGCTC TCTTCATCTG GGCCATTCTT CAGAATAAGA AGGAACTCTC CAAAGTCATT 1740

TGGGAGCAGA CCAGGGGCTG CACTCTGGCA GCCCTGGGAG CCAGCAAGCT TCTGAAGACT 1800 

CTGGCCAAAG TGAAGAACGA CATCAATGCT GCTGGGGAGT CCGAGGAGCT GGCTAATGAG 1860 

TACGAGACCC GGGCTGTTGA GCTGTTCACT GAGTGTTACA GCAGCGATGA AGACTTGGCA 1920 

GAACAGCTGC TGGTCTATTC CTGTGAAGCT TGGGGTGGAA GCAACTGTCT GGAGCTGGCG 1980 

GTGGAGGCCA CAGACCAGCA TTTCATCGCC CAGCCTGGGG TCCAGAATTT TCTTTCTAAG 2040 

CAATGGTATG GAGAGATTTC CCGAGACACC AAGAACTGGA AGATTATCCT GTGTCTGTTT 2100 

ATTATACCCT TGGTGGGCTG TGGCTTTGTA TCATTTAGGA AGAAACCTGT CGACAAGCAC 2160 

AAGAAGCTGC TTTGGTACTA TGTGGCGTTC TTCACCTCCC CCTTCGTGGT CTTCTCCTGG 2220 

AATGTGGTCT TCTACATCGC CTTCCTCCTG CTGTTTGCCT ACGTGCTGCT CATGGATTTC 2280 

CATTCGGTGC CACACCCCCC CGAGCTGGTC CTGTACTCGC TGGTCTTTGT CCTCTTCTGT 2340 

GATGAAGTGA GACAGTGGTA CGTAAATGGG GTGAATTATT TTACTGACCT GTGGAATGTG 2400 

ATGGACACGC TGGGGCTTTT TTACTTCATA GCAGGAATTG TATTTCGGCT CCACTCTTCT 2460 

AATAAAAGCT CTTTGTATTC TGGACGAGTC ATTTTCTGTC TGGACTACAT TATTTTCACT 2520

CTAAGATTGA TCCACATTTT TACTGTAAGC AGAAACTTAG GACCCAAGAT TATAATGCTG 2580 

CAGAGGATGC TGATCGATGT GTTCTTCTTC CTGTTCCTCT TTGCGGTGTG GATGGTGGCC 2640 

TTTGGCGTGG CCAGGCAAGG GATCCTTAGG CAGAATGAGC AGCGCTGGAG GTGGATATTC 2700 

CGTTCGGTCA TCTACGAGCC CTACCTGGCC ATGTTCGGCC AGGTGCCCAG TGACGTGGAT 2760

GGTACCACGT ATGACTTTGC CCACTGCACC TTCACTGGGA ATGAGTCCAA GCCACTGTGT 2820 

GTGGAGCTGG ATGAGCACAA CCTGCCCCGG TTCCCCGAGT GGATCACCAT CCCCCTGGTG 2880 

TGCATCTACA TGTTATCCAC CAACATCCTG CTGGTCAACC TGCTGGTCGC CATGTTTGGC 2940

TACACGGTGG GCACCGTCCA GGAGAACAAT GACCAGGTCT GGAAGTTCCA GAGGTACTTC 3000

CTGGTGCAGG AGTACTGCAG CCGCCTCAAT ATCCCCTTCC CCTTCATCGT CTTCGCTTAC 3060 

TTCTACATGG TGGTGAAGAA GTGCTTCAAG TGTTGCTGCA AGGAGAAAAA CATGGAGTCT 3120 

TCTGTCTGCT GTTTCAAAAA TGAAGACAAT GAGACTCTGG CATGGGAGGG TGTCATGAAG 3180 

GAAAACTACC TTGTCAAGAT CAACACAAAA GCCAACGACA CCTCAGAGGA AATGAGGCAT 3240 

CGATTTAGAC AACTGGATAC AAAGCTTAAT GATCTCAAGG GTCTTCTGAA AGAGATTGCT 3300 
AATAAAATCA AATGA

SEQ ID NO:38 PBH1 Protein sequence 
Protein Accession #: XP_017718

1 11 21 31 41 51

| | | | | |

MSFRAARLSM RNRRNDTLDS TRTLYSSASR STDLSYSESD LVNFIQANFK KRECVFFTKD 60

SKATENVCKC GYAQSQHMEG TQINQSEKWN YKKHTKEFPT DAFGDIQFET LGKKGKYIRL 120

SCDTDAEILY ELLTQHWHLK TPNLVISVTG GAKNFALKPR MRKIFSRLIY IAQSKGAWIL 180

TGGTHYGLMK YIGEVVRDNT ISRSSEENIV AIGIAAWGMV SNRDTLIRNC DAEGYFLAQY 240

LMDDFTRDPL YILDNNHTHL LLVDNGCHGH PTVEAKLRNQ LEKYISERTI QDSNYGGKIP 300

IVCFAQGGGK ETLKAINTSI KNKIPCVVVE GSGQIADVIA SLVEVEDALT SSAVKEKLVR 360

FLPRTVSRLP EEETESWIKW LKEILECSHL LTVIKMEEAG DEIVSNAISY ALYKAFSTSE 420

QDKDNWNGQL KLLLEWNQLD LANDEIFTND RRWESADLQE VMFTALIKDR PKFVRLFLEN 480

GLNLRKFLTH DVLTELFSNH FSTLVYRNLQ IAKNSYNDAL LTFVWKLVAN FRRGFRKEDR 540 

NGRDEMDIEL HDVSPITRHP LQALFIWAIL QNKKELSKVI WEQTRGCTLA ALGASKLLKT 600 

LAKVKNDINA AGESEELANE YETRAVELFT ECYSSDEDLA EQLLVYSCEA WGGSNCLELA 660

VEATDQHFIA QPGVQNFLSK QWYGEISRDT KNWKIILCLF IIPLVGCGFV SFRKKPVDKH 720

KKLLWYYVAF FTSPFVVFSW NVVFYIAFLL LFAYVLLMDF HSVPHPPELV LYSLVFVLFC 780

DEVRQWYVNG VNYFTDLWNV MDTLGLFYFI AGIVFRLHSS NKSSLYSGRV IFCLDYIIFT 840 

LRLIHIFTVS RNLGPKIIML QRMLIDVFFF LFLFAVWMVA FGVARQGILR QNEQRWRWIF 900 

RSVIYEPYLA MFGQVPSDVD GTTYDFAHCT FTGNESKPLC VELDEHNLPR FPEWITIPLV 960 

CIYMLSTNIL LVNLLVAMFG YTVGTVQENN DQVWKFQRYF LVQEYCSRLN IPFPFIVFAY 1020

FYMVVKKCFK CCCKEKNMES SVCCFKNEDN ETLAWEGVMK ENYLVKINTK ANDTSEEMRH 1080
RFRQLDTKLN DLKGLLKEIA NKIK 

SEQ ID NO:39 PBH3 DNA SEQUENCE 

Nucleic Acid Accession #: XM_011804

Coding sequence: 1-558 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51

| | | | | |

ATGCCTCGCC TGTTCTTGTT CCACCTGCTA GAATTCTGTT TACTACTGAA CCAATTTTCC 60

AGAGCAGTCG CGGCCAAATG GAAGGACGAT GTTATTAAAT TATGCGGCCG CGAATTAGTT 120 

CGCGCGCAGA TTGCCATTTG CGGCATGAGC ACCTGGAGCA AAAGGTCTCT GAGCCAGGAA 180 

GATGCTCCTC AGACACCTAG ACCAGTGGCA GAAATTGTAC CATCCTTCAT CAACAAAGAT 240 

ACAGAAACTA TAATTATCAT GTTGGAATTC ATTGCTAATT TGCCACCGGA GCTGAAGGCA 300 

GCCCTATCTG AGAGGCAACC ATCATTACCA GAGCTACAGC AGTATGTACC TGCATTAAAG 360 

GATTCCAATC TTAGCTTTGA AGAATTTAAG AAACTTATTC GCAATAGGCA AAGTGAAGCC 420 

GCAGACAGCA ATCCTTCAGA ATTAAAATAC TTAGGCTTGG ATACTCATTC TCAAAAAAAG 480 

AGACGACCCT ACGTGGCACT GTTTGAGAAA TGTTGCCTAA TTGGTTGTAC CAAAAGGTCT 540
CTTGCTAAAT ATTGCTGA 

SEQ ID NO:40 PBH3 PROTEIN SEQUENCE

Protein Accession #: NP_008842

1 11 21 31 41 51

| | | | | |

MPRLFLFHLL EFCLLLNQFS RAVAAKWKDD VIKLCGRELV RAQIAICGMS TWSKRSLSQE 60

DAPQTPRPVA EIVPSFINKD TETIIIMLEF IANLPPELKA ALSERQPSLP ELQQYVPALK 120

DSNLSFEEFK KLIRNRQSEA ADSNPSELKY LGLDTHSQKK RRPYVALFEK CCLIGCTKRS 180 

LAKYC 
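
Each protein entry in this listing appears to be the conceptual translation of the coding range given for the DNA entry that precedes it. The following minimal Python sketch illustrates that relationship under the assumption that the standard genetic code applies. The names `CODON_TABLE` and `translate` are hypothetical and are not part of the original listing. The example input is the first 30 nucleotides of SEQ ID NO:39 with the listing spaces removed.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G
# (first base varies slowest, third base fastest). "*" marks a stop.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(cds: str) -> str:
    """Translate a coding sequence codon by codon, stopping at a stop codon.

    Codons that are not in the table (for example ones containing
    ambiguity letters) are reported as "X".
    """
    cds = cds.upper()
    protein = []
    for i in range(0, len(cds) - 2, 3):
        amino_acid = CODON_TABLE.get(cds[i:i + 3], "X")
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


# First three ten-base blocks of SEQ ID NO:39.
first_blocks = "ATGCCTCGCCTGTTCTTGTTCCACCTGCTA"
print(translate(first_blocks))  # MPRLFLFHLL
```

The printed residues match the first block of SEQ ID NO:40, which is the kind of cross-check that can expose transcription errors in either sequence.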

SEQ ID NO:41 PBH5 DNA SEQUENCE 

Nucleic Acid Accession #: NM_005845

Coding sequence: 1-3978 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51

| | | | | |

ATGCTGCCCG TGTACCAGGA GGTGAAGCCC AACCCGCTGC AGGACGCGAA CCTCTGCTCA 60 

CGCGTGTTCT TCTGGTGGCT CAATCCCTTG TTTAAAATTG GCCATAAACG GAGATTAGAG 120 

GAAGATGATA TGTATTCAGT GCTGCCAGAA GACCGCTCAC AGCACCTTGG AGAGGAGTTG 180 

CAAGGGTTCT GGGATAAAGA AGTTTTAAGA GCTGAGAATG ACGCACAGAA GCCTTCTTTA 240 

ACAAGAGCAA TCATAAAGTG TTACTGGAAA TCTTATTTAG TTTTGGGAAT TTTTACGTTA 300

ATTGAGGAAA GTGCCAAAGT AATCCAGCCC ATATTTTTGG GAAAAATTAT TAATTATTTT 360 

GAAAATTATG ATCCCATGGA TTCTGTGGCT TTGAACACAG CGTACGCCTA TGCCACGGTG 420 

CTGACTTTTT GCACGCTCAT TTTGGCTATA CTGCATCACT TATATTTTTA TCACGTTCAG 480

TGTGCTGGGA TGAGGTTACG AGTAGCCATG TGCCATATGA TTTATCGGAA GGCACTTCGT 540 

CTTAGTAACA TGGCCATGGG GAAGACAACC ACAGGCCAGA TAGTCAATCT GCTGTCCAAT 600 

GATGTGAACA AGTTTGATCA GGTGACAGTG TTCTTACACT TCCTGTGGGC AGGACCACTG 660

CAGGCGATCG CAGTGACTGC CCTACTCTGG ATGGAGATAG GAATATCGTG CCTTGCTGGG 720 

ATGGCAGTTC TAATCATTCT CCTGCCCTTG CAAAGCTGTT TTGGGAAGTT GTTCTCATCA 780 

CTGAGGAGTA AAACTGCAAC TTTCACGGAT GCCAGGATCA GGACCATGAA TGAAGTTATA 840 

ACTGGTATAA GGATAATAAA AATGTACGCC TGGGAAAAGT CATTTTCAAA TCTTATTACC 900 

AATTTGAGAA AGAAGGAGAT TTCCAAGATT CTGAGAAGTT CCTGCCTCAG GGGGATGAAT 960 

TTGGCTTCGT TTTTCAGTGC AAGCAAAATC ATCGTGTTTG TGACCTTCAC CACCTACGTG 1020 

CTCCTCGGCA GTGTGATCAC AGCCAGCCGC GTGTTCGTGG CAGTGACGCT GTATGGGGCT 1080 

GTGCGGCTGA CGGTTACCCT CTTCTTCCCC TCAGCCATTG AGAGGGTGTC AGAGGCAATC 1140 

GTCAGCATCC GAAGAATCCA GACCTTTTTG CTACTTGATG AGATATCACA GCGCAACCGT 1200 

CAGCTGCCGT CAGATGGTAA AAAGATGGTG CATGTGCAGG ATTTTACTGC TTTTTGGGAT 1260 

AAGGCATCAG AGACCCCAAC TCTACAAGGC CTTTCCTTTA CTGTCAGACC TGGCGAATTG 1320 

TTAGCTGTGG TCGGCCCCGT GGGAGCAGGG AAGTCATCAC TGTTAAGTGC CGTGCTCGGG 1380 

GAATTGGCCC CAAGTCACGG GCTGGTCAGC GTGCATGGAA GAATTGCCTA TGTGTCTCAG 1440 

CAGCCCTGGG TGTTCTCGGG AACTCTGAGG AGTAATATTT TATTTGGGAA GAAATACGAA 1500 

AAGGAACGAT ATGAAAAAGT CATAAAGGCT TGTGCTCTGA AAAAGGATTT ACAGCTGTTG 1560 

GAGGATGGTG ATCTGACTGT GATAGGAGAT CGGGGAACCA CGCTGAGTGG AGGGCAGAAA 1620

GCACGGGTAA ACCTTGCAAG AGCAGTGTAT CAAGATGCTG ACATCTATCT CCTGGACGAT 1680

CCTCTCAGTG CAGTAGATGC GGAAGTTAGC AGACACTTGT TCGAACTGTG TATTTGTCAA 1740

ATTTTGCATG AGAAGATCAC AATTTTAGTG ACTCATCAGT TGCAGTACCT CAAAGCTGCA 1800

AGTCAGATTC TGATATTGAA AGATGGTAAA ATGGTGCAGA AGGGGACTTA CACTGAGTTC 1860

CTAAAATCTG GTATAGATTT TGGCTCCCTT TTAAAGAAGG ATAATGAGGA AAGTGAACAA 1920 

CCTCCAGTTC CAGGAACTCC CACACTAAGG AATCGTACCT TCTCAGAGTC TTCGGTTTGG 1980 

TCTCAACAAT CTTCTAGACC CTCCTTGAAA GATGGTGCTC TGGAGAGCCA AGATACAGAG 2040 

AATGTCCCAG TTACACTATC AGAGGAGAAC CGTTCTGAAG GAAAAGTTGG TTTTCAGGCC 2100 

TATAAGAATT ACTTCAGAGC TGGTGCTCAC TGGATTGTCT TCATTTTCCT TATTCTCCTA 2160 

AACACTGCAG CTCAGGTTGC CTATGTGCTT CAAGATTGGT GGCTTTCATA CTGGGCAAAC 2220 

AAACAAAGTA TGCTAAATGT CACTGTAAAT GGAGGAGGAA ATGTAACCGA GAAGCTAGAT 2280 

CTTAACTGGT ACTTAGGAAT TTATTCAGGT TTAACTGTAG CTACCGTTCT TTTTGGCATA 2340 

GCAAGATCTC TATTGGTATT CTACGTCCTT GTTAACTCTT CACAAACTTT GCACAACAAA 2400 

ATGTTTGAGT CAATTCTGAA AGCTCCGGTA TTATTCTTTG ATAGAAATCC AATAGGAAGA 2460 

ATTTTAAATC GTTTCTCCAA AGACATTGGA CACTTGGATG ATTTGCTGCC GCTGACGTTT 2520

TTAGATTTCA TCCAGACATT GCTACAAGTG GTTGGTGTGG TCTCTGTGGC TGTGGCCGTG 2580 

ATTCCTTGGA TCGCAATACC CTTGGTTCCC CTTGGAATCA TTTTCATTTT TCTTCGGCGA 2640 

TATTTTTTGG AAACGTCAAG AGATGTGAAG CGCCTGGAAT CTACAACTCG GAGTCCAGTG 2700 

TTTTCCCACT TGTCATCTTC TCTCCAGGGG CTCTGGACCA TCCGGGCATA CAAAGCAGAA 2760 

GAGAGGTGTC AGGAACTGTT TGATGCACAC CAGGATTTAC ATTCAGAGGC TTGGTTCTTG 2820

TTTTTGACAA CGTCCCGCTG GTTCGCCGTC CGTCTGGATG CCATCTGTGC CATGTTTGTC 2880 

ATCATCGTTG CCTTTGGGTC CCTGATTCTG GCAAAAACTC TGGATGCCGG GCAGGTTGGT 2940 

TTGGCACTGT CCTATGCCCT CACGCTCATG GGGATGTTTC AGTGGTGTGT TCGACAAAGT 3000 

GCTGAAGTTG AGAATATGAT GATCTCAGTA GAAAGGGTCA TTGAATACAC AGACCTTGAA 3060 

AAAGAAGCAC CTTGGGAATA TCAGAAACGC CCACCACCAG CCTGGCCCCA TGAAGGAGTG 3120 

ATAATCTTTG ACAATGTGAA CTTCATGTAC AGTCCAGGTG GGCCTCTGGT ACTGAAGCAT 3180

CTGACAGCAC TCATTAAATC ACAAGAAAAG GTTGGCATTG TGGGAAGAAC CGGAGCTGGA 3240

AAAAGTTCCC TCATCTCAGC CCTTTTTAGA TTGTCAGAAC CCGAAGGTAA AATTTGGATT 3300

GATAAGATCT TGACAACTGA AATTGGACTT CACGATTTAA GGAAGAAAAT GTCAATCATA 3360 

CCTCAGGAAC CTGTTTTGTT CACTGGAACA ATGAGGAAAA ACCTGGATCC CTTTAATGAG 3420 

CACACGGATG AGGAACTGTG GAATGCCTTA CAAGAGGTAC AACTTAAAGA AACCATTGAA 3480 

GATCTTCCTG GTAAAATGGA TACTGAATTA GCAGAATCAG GATCCAATTT TAGTGTTGGA 3540 

CAAAGACAAC TGGTGTGCCT TGCCAGGGCA ATTCTCAGGA AAAATCAGAT ATTGATTATT 3600

GATGAAGCGA CGGCAAATGT GGATCCAAGA ACTGATGAGT TAATACAAAA AAAAATCCGG 3660

GAGAAATTTG CCCACTGCAC CGTGCTAACC ATTGCACACA GATTGAACAC CATTATTGAC 3720 

AGCGACAAGA TAATGGTTTT AGATTCAGGA AGACTGAAAG AATATGATGA GCCGTATGTT 3780 

TTGCTGCAAA ATAAAGAGAG CCTATTTTAC AAGATGGTGC AACAACTGGG CAAGGCAGAA 3840

GCCGCTGCCC TCACTGAAAC AGCAAAACAG GTATACTTCA AAAGAAATTA TCCACATATT 3900 

GGTCACACTG ACCACATGGT TACAAACACT TCCAATGGAC AGCCCTCGAC CTTAACTATT 3960 
TTCGAGACAG CACTGTGA
SEQ ID NO:42 PBH5 PROTEIN SEQUENCE

Protein Accession #: NP_005836

1 11 21 31 41 51

| | | | | |

MLPVYQEVKP NPLQDANLCS RVFFWWLNPL FKIGHKRRLE EDDMYSVLPE DRSQHLGEEL 60 

QGFWDKEVLR AENDAQKPSL TRAIIKCYWK SYLVLGIFTL IEESAKVIQP IFLGKIINYF 120 

ENYDPMDSVA LNTAYAYATV LTFCTLILAI LHHLYFYHVQ CAGMRLRVAM CHMIYRKALR 180 

LSNMAMGKTT TGQIVNLLSN DVNKFDQVTV FLHFLWAGPL QAIAVTALLW MEIGISCLAG 240 

MAVLIILLPL QSCFGKLFSS LRSKTATFTD ARIRTMNEVI TGIRIIKMYA WEKSFSNLIT 300

NLRKKEISKI LRSSCLRGMN LASFFSASKI IVFVTFTTYV LLGSVITASR VFVAVTLYGA 360 

VRLTVTLFFP SAIERVSEAI VSIRRIQTFL LLDEISQRNR QLPSDGKKMV HVQDFTAFWD 420 

KASETPTLQG LSFTVRPGEL LAVVGPVGAG KSSLLSAVLG ELAPSHGLVS VHGRIAYVSQ 480

QPWVFSGTLR SNILFGKKYE KERYEKVIKA CALKKDLQLL EDGDLTVIGD RGTTLSGGQK 540 

ARVNLARAVY QDADIYLLDD PLSAVDAEVS RHLFELCICQ ILHEKITILV THQLQYLKAA 600

SQILILKDGK MVQKGTYTEF LKSGIDFGSL LKKDNEESEQ PPVPGTPTLR NRTFSESSVW 660 

SQQSSRPSLK DGALESQDTE NVPVTLSEEN RSEGKVGFQA YKNYFRAGAH WIVFIFLILL 720

NTAAQVAYVL QDWWLSYWAN KQSMLNVTVN GGGNVTEKLD LNWYLGIYSG LTVATVLFGI 780

ARSLLVFYVL VNSSQTLHNK MFESILKAPV LFFDRNPIGR ILNRFSKDIG HLDDLLPLTF 840 

LDFIQTLLQV VGVVSVAVAV IPWIAIPLVP LGIIFIFLRR YFLETSRDVK RLESTTRSPV 900

FSHLSSSLQG LWTIRAYKAE ERCQELFDAH QDLHSEAWFL FLTTSRWFAV RLDAICAMFV 960

IIVAFGSLIL AKTLDAGQVG LALSYALTLM GMFQWCVRQS AEVENMMISV ERVIEYTDLE 1020 

KEAPWEYQKR PPPAWPHEGV IIFDNVNFMY SPGGPLVLKH LTALIKSQEK VGIVGRTGAG 1080

KSSLISALFR LSEPEGKIWI DKILTTEIGL HDLRKKMSII PQEPVLFTGT MRKNLDPFNE 1140 

HTDEELWNAL QEVQLKETIE DLPGKMDTEL AESGSNFSVG QRQLVCLARA ILRKNQILII 1200

DEATANVDPR TDELIQKKIR EKFAHCTVLT IAHRLNTIID SDKIMVLDSG RLKEYDEPYV 1260

LLQNKESLFY KMVQQLGKAE AAALTETAKQ VYFKRNYPHI GHTDHMVTNT SNGQPSTLTI 1320
FETAL 

SEQ ID NO:43 PBQ7 DNA SEQUENCE

Nucleic Acid Accession #: NM_021233

Coding sequence: 34-1119 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51

| | | | | |

ATGGGGAAAG TGTCCTGCTG TGGCATGAAA TAAATGAAAC AGAAAATGAT GGCAAGACTG 60

CTAAGAACAT CCTTTGCTTT GCTCTTCCTT GGCCTCTTTG GGGTGCTGGG GGCAGCAACA 120 

ATTTCATGCA GAAATGAAGA AGGGAAAGCT GTGGACTGGT TTACTTTTTA TAAGTTACCT 180 

AAAAGACAAA ACAAGGAAAG TGGAGAGACT GGGTTAGAGT ACCTGTACCT AGACTCTACA 240

ACTAGAAGCT GGAGGAAGAG TGAGCAACTA ATGAATGACA CCAAGAGTGT TTTGGGAAGG 300

ACATTACAAC AGCTATATGA AGCATATGCC TCTAAGAGTA ACAACACAGC CTATCTAATA 360 

TACAATGATG GAGTCCCTAA ACCTGTGAAT TACAGTAGAA AGTATGGACA CACCAAAGGT 420 

TTACTGCTGT GGAACAGAGT TCAAGGGTTC TGGCTGATTC ATTCCATCCC TCAGTTTCCT 480

CCAATTCCGG AAGAAGGCTA TGATTATCCA CCCACAGGGA GACGAAATGG ACAAAGTGGC 540

ATCTGCATAA CTTTCAAGTA CAACCAGTAT GAGGCAATAG ATTCTCAGCT CTTGGTCTGC 600

AACCCCAACG TCTATAGCTG CTCCATCCCA GCCACCTTTC ACCAGGAGCT CATTCACATG 660 

CCCCAGCTGT GCACCAGGGC CAGCTCATCA GAGATTCCTG GCAGGCTCCT CACCACACTT 720 

CAGTCGGCCC AGGGACAAAA ATTCCTCCAT TTTGCAAAGT CGGATTCTTT TCTTGACGAC 780 

ATCTTTGCAG CCTGGATGGC TCAACGGCTG AAGACACACT TGTTAACAGA AACCTGGCAG 840

CGAAAAAGAC AAGAGCTTCC TTCAAACTGC TCCCTTCCTT ACCATGTCTA CAATATAAAA 900

GCAATTAAAT TATCACGACA CTCTTATTTC AGTTCTTATC AAGATCACGC CAAGTGGTGT 960 

ATTTCCCAAA AGGGCACCAA AAATCGCTGG ACATGTATTG GAGACCTAAA TCGGAGTCCA 1020 

CACCAAGCCT TCAGAAGTGG AGGATTCATT TGTACCCAGA ATTGGCAAAT TTACCAAGCA 1080 
TTTCAAGGAT TAGTATTATA CTATGAAAGC TGTAAGTAAA CTTGGTGAAA GGACACAGGT 






SEQ ID NO:44 PBQ7 Protein sequence
Protein Accession #: NP_067056

1 11 21 31 41 51 

| | | | | |

MMARLLRTSF ALLFLGLFGV LGAATISCRN EEGKAVDWFT FYKLPKRQNK ESGETGLEYL 60

YLDSTTRSWR KSEQLMNDTK SVLGRTLQQL YEAYASKSNN TAYLIYNDGV PKPVNYSRKY 120

GHTKGLLLWN RVQGFWLIHS IPQFPPIPEE GYDYPPTGRR NGQSGICITF KYNQYEAIDS 180

QLLVCNPNVY SCSIPATFHQ ELIHMPQLCT RASSSEIPGR LLTTLQSAQG QKFLHFAKSD 240 

SFLDDIFAAW MAQRLKTHLL TETWQRKRQE LPSNCSLPYH VYNIKAIKLS RHSYFSSYQD 300 
HAKWCISQKG TKNRWTCIGD LNRSPHQAFR SGGFICTQNW QIYQAFQGLV LYYESCK 



SEQ ID NO:45 PCQ8 DNA SEQUENCE 

Nucleic Acid Accession #: XM_030453 
Coding sequence: 89-1273 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51 

I I I I I I 

CGGTGCCCTG GGGTGGAATA TCCCCTACGA ATTTAACCAA GCGGACTTTA ATGCCACTGT 60 

GCAGTTCATC CAAAACCACT TGGATGACAT GGATGTCAAA AAGGGTGTCT CCTGGACCAC 120

CATCCGCTAC ATGATAGGAG AGATTCAATA TGGAGGCAGA GTCACTGACG ACTATGATAA 180 

GAGATTGTTG AACACATTTG CTAAGGTTTG GTTCAGTGAA AATATGTTTG GACCAGATTT 240 

CAGTTTTTAC CAAGGATACA ATATTCCAAA ATGCAGCACA GTGGATAACT ATCTTCAGTA 300 

TATCCAGAGT TTGCCTGCCT ATGACAGCCC TGAGGTGTTT GGGCTGCACC CCAATGCTGA 360
CATCACCTAC CAGAGCAAGC TGGCCAAGGA CGTGCTGGAC ACCATCCTAG GCATCCAACC
CAAGGACACC TCTGGTGGAG GGGATGAGAC CCGGGAGGCG GTGGTGGCCC GGCTGGCTGA 
TGATATGCTG GAGAAGCTGC CCCCAGACTA TGTCCCCTTT GAAGTAAAAG AGAGGCTGCA 
GAAGATGGGG CCATTCCAGC CTATGAACAT TTTCCTCAGG CAGGAAATAG ACAGAATGCA 
AAGGGTACTC AGCCTTGTCC GCAGCACCCT CACTGAGCTG AAACTTGCTA TTGATGGCAC

CATCATCATG AGCGAAAATC TGCAAGATGC ATTGGATTGC ATGTTTGATG CTAGAATCCC 
TGCTTGGTGG AAAAAAGCTT CTTGGGTTTT TAGTACACTG GGTTTCTGGT TTACTGAACT 
TATAGAAAGA AACAGCCAGT TTACCTCGTG GGTTTTCAAT GGCCGACCTC ACTGCTTTTG 
GATGACGGGT TTTTTTAACC CCCAGGGATT TTTAACTGCA ATGCGACAGG AAATAACTCG 
GGCCAACAAA GGCTGGGCTC TGGACAATAT GGTGCTTTGC AATGAAGTCA CCAAATGGAT

GAAGGACGAC ATTTCTACCC CTCCCACAGA GGGTGTCTAT GTCTATGGCT TATATCTTGA
AGGTGCTGGC TGGGACAAGA GGAACATGAA ACTCATTGAA TCAAAGCCAA AAGTGCTCTT 
TGAGTTGATG CCTGTCATAA GGATTTATGC AGAAAACAAT ACTTTACGAG ATCCTCGGTT 
TTACTCCTGT CCCATCTATA AGAAGCCAGT TCGAACGGAC TTGAACTACA TTGCCGCTGT 
GGATCTCAGG ACAGCCCAGA CCCCTGAACA CTGGGTGCTC CGTGGGGTTG CCCTTCTGTG

TGATGTCAAG TAACATGTGG GGAGTGTCCC CACCCAATGC TTTGGAAAAT GCAAGATCTA 
AATTATTGTA ACCTTTATTT CTGTATGACT GCTGGACAGT GTATGTTAGG TCGTTTATGC 
AATTAATGAG CTGCATAGGT TTTCCCCACT CCTTAATTGG ATGCTTATAT TTTACTTGTT 
TCATCATTAG TGACCAATGT CTGAGTTTGT TGAAAATGTT ATTTAGTGAT ATAAAAGTAA 
ATTTACAGCA TCCTAATGAA GTGTGGCCCT CAAATCCACA GTAGTATATT TTCTTCTTAC

TTCGCTCCGA AGACTGACTG TGATTATAAC AGCAAATATA TTTGCATGTG GACAAAGATT 
AGATGGCAAG ATAGAAAAAT AAGAACAGAT GTGATAGCAA GAATTATAGT TGGCTTGAAA 
AAATGTGATG ATCAGGAGAA AAAATAAAAA AAGGGTAGAA ATATTAGACG GTGCGTAGGG
ACTTTCTATG GACTTTTATT AATTAGGAAA CATTATCAAA GGAACTTTTC ACGTATTTTT
CTTTAAATTC TGGTTAGATG TTATTAATAA TTCTTCATCT AACCTACTGA CTAGAAAATA

TAGTCAGTAC TAAATTAGAA TTGTGGTTTA TAAACTTTTG GTTAGCTCTG GATCTGTATA

ACTGCATTTT TTTGGATAAA CAGTTTTTGG TAGGTGGATA CCGGGAGACA AGTGTGGGTC

CCTCTCACTG GGCTTCATTC TGTGGACCAG GATCATTATT TCATGCTCAT GATCATGAGA

GTTAGGACTG AGTGGCTCCT GTGACTCCCA CCATCTTAGA TGATACTGTT TTCTTGTGAG

TTCTTTCTTT TGGTGTGGAT TAGTATATCA GTTGATTTGT GTGAATTGTG GTGAAACAAT
CATTTCATTT TGAAAAGCAA GTAATGAAAA TGTCAGCATC ATAGGAATTA ATAAAATGTT

TTTACTAAAA AAAAAAAAAA AAA

SEQ ID NO:46 PCQ8 Protein sequence 
Protein Accession #: BAB15543

1 11 21 31 41 51

| | | | | |

MDVKKGVSWT TIRYMIGEIQ YGGRVTDDYD KRLLNTFAKV WFSENMFGPD FSFYQGYNIP 
KCSTVDNYLQ YIQSLPAYDS PEVFGLHPNA DITYQSKLAK DVLDTILGIQ PKDTSGGGDE

TREAVVARLA DDMLEKLPPD YVPFEVKERL QKMGPFQPMN IFLRQEIDRM QRVLSLVRST
LTELKLAIDG TIIMSENLQD ALDCMFDARI PAWWKKASWV FSTLGFWFTE LIERNSQFTS

WVFNGRPHCF WMTGFFNPQG FLTAMRQEIT RANKGWALDN MVLCNEVTKW MKDDISTPPT

EGVYVYGLYL EGAGWDKRNM KLIESKPKVL FELMPVIRIY AENNTLRDPR FYSCPIYKKP

VRTDLNYIAA VDLRTAQTPE HWVLRGVALL CDVK

SEQ ID NO:47 PDG5 DNA SEQUENCE

Nucleic Acid Accession #: AB033036

Coding sequence: 68-3349 (underlined sequences correspond to start and stop codons) 

1 11 21 31 41 51 

| | | | | |

GGAGCAGCCT ACAACTTCAC AACCAGAAAC CACTACCCCT CAGGGGTTGC TTTCAGATAA 60 

AGATGACATG GGAAGGAGAA ATGCTGGCAT AGATTTCGGA TCCAGAAAAG CATCAGCAGC 120 

ACAGCCCATA CCTGAAAACA TGGACAATTC CATGGTTAGT GATCCACAAC CATACCATGA 180

AGATGCAGCT TCTGGAGCTG AGAAGACAGA AGCCAGAGCT TCTCTCTCAC TGATGGTGGA 240 

AAGCCTTTCT ACAACCCAAG AGGAGGCCAT TCTCTCAGTA GCAGCAGAGG CTCAGGTGTT 300 

TATGAATCCT TCTCATATCC AGTTAGAAGA TCAAGAAGCT TTCAGCTTTG ATTTACAAAA 360 

GGCCCAATCC AAAATGGAGT CAGCCCAGGA TGTTCAAACT ATCTGCAAAG AAAAGCCTTC 420 

TGGAAATGTT CACCAGACCT TTACAGCAAG TGTTTTGGGT ATGACAAGTA CTACAGCCAA 480

AGGAGATGTT TATGCCAAGA CTCTGCCTCC CAGAAGCCTT TTTCAGTCCT CAAGGAAGCC 540 

TGATGCTGAA GAAGTCTCCT CAGATTCAGA GAATATTCCT GAGGAGGGGG ATGGTTCTGA 600

AGAACTGGCT CATGGTCACT CTTCCCAGTC CTTGGGGAAG TTTGAAGATG AACAAGAAGT 660 

CTTCTCAGAA TCAAAAAGTT TTGTTGAGGA CTTGAGCAGC TCTGAGGAGG AGCTGGACCT 720 

CAGATGCCTC TCCCAGGCTT TAGAGGAGCC TGAAGATGCA GAAGTCTTCA CAGAATCAAG 780

CAGTTATGTT GAAAAGTACA ACACTTCTGA TGATTGCAGC AGCTCAGAGG AAGACCTGCC 840 

TCTCAGACAC CCTGCTCAGG CCTTGGGAAA GCCCAAAAAC CAACAAGAAG TCTCCTCTGC 900

TTCAAATAAT ACTCCTGAAG AGCAGAATGA TTTTATGCAG CAGCTGCCTT CCAGATGCCC 960 

TTCTCAGCCC ATTATGAATC CTACTGTTCA GCAACAAGTC CCCACCAGTT CAGTGGGCAC 1020 

TTCTATAAAA CAGAGCGATT CCGTGGAGCC AATCCCTCCA AGACACCCTT TCCAGCCATG 1080

GGTGAACCCT AAAGTGGAGC AAGAAGTTTC CTCATCTCCA AAGAGCATGG CTGTTGAAGA 1140 

GAGCATTTCT ATGAAGCCTC TGCCTCCTAA ACTTCTTTGC CAGCCCTTGA TGAATCCTAA 1200 

AGTTCAACAA AACATGTTCT CAGGTTCAGA GGACATTGCT GTTGAGAGAG TCATTTCTGT 1260 

GGAGCCACTA CTCCCCAGAT ATTCTCCTCA GTCCTTGACA GATCCTCAAA TCCGGCAAAT 1320 

CTCAGAAAGC ACAGCTGTTG AGGAAGGCAC TTATGTGGAA CCGCTGCCTC CCAGATGCCT 1380

TTCCCAGCCC TCGGAGAGGC CTAAGTTCCT GGACTCAATG AGTACTTCTG CAGAATGGAG 1440 

CAGTCCTGTG GCACCAACAC CTTCCAAATA CACTTCCCCG CCATGGGTGA CCCCTAAATT 1500 

TGAGGAACTG TATCAACTCT CTGCACATCC AGAAAGCACT ACTGTTGAAG AGGACATTTC 1560 

TAAGGAGCAG CTGCTTCCCA GACATCTTTC CCAGTTGACT GTGGGAAATA AAGTCCAGCA 1620 

ACTGTCCTCA AATTTCGAGC GGGCTGCTAT TGAGGCAGAC ATTTCTGGGA GTCCATTGCC 1680
TCCCCAATAT GCTACCCAGT TCTTAAAGAG GTCTAAAGTT CAGGAAATGA CCTCACGACT 1740

AGAGAAAATG GCTGTTGAAG GCACTTCTAA CAAATCACCG ATTCCCAGGC GTCCGACCCA 1800 

GTCATTCGTG AAATTTATGG CACAGCAAAT CTTTTCAGAG AGCTCTGCTC TTAAGAGGGG 1860 

CAGTGATGTG GCACCTCTGC CTCCCAATCT TCCTTCCAAA TCTTTATCAA AGCCTGAAGT 1920 

CAAGCACCAA GTTTTCTCAG ATTCAGGGAG TGCTAATCCT AAGGGAGGCA TTTCTTCAAA 1980

GATGCTACCT ATGAAGCACC CTTTACAGTC CTTGGGGAGG CCTGAAGACC CACAGAAAGT 2040 

TTTCTCTTAT TCAGAGAGAG CTCCTGGGAA GTGCAGCAGT TTTAAAGAGC AGCTGTCTCC 2100 

CAGGCAGCTT TCCCAGGCCT TGAGGAAACC TGAGTATGAG CAAAAAGTCT CCCCTGTTTC 2160 

TGCCAGTTCT CCTAAAGAGT GGAGGAATTC TAAAAAGCAG CTGCCTCCCA AACATTCTTC 2220 

CCAAGCCTCA GATAGGTCTA AATTCCAGCC ACAGATGTCA TCAAAGGGCC CAGTGAATGT 2280

ACCTGTAAAG CAGAGCAGCG GTGAGAAGCA CCTGCCTTCA AGTAGTCCTT TCCAGCAACA 2340

GGTTCATTCA AGTTCTGTGA ATGCTGCTGC TAGGCGATCT GTTTTTGAGA GCAATTCTGA 2400 

CAATTGGTTC CTAGGAAGAG ATGAAGCTTT TGCAATCAAA ACCAAGAAAT TCAGCCAAGG 2460 

TTCCAAAAAC CCCATAAAGA GCATTCCAGC CCCTGCTACC AAACCTGGGA AGTTCACCAT 2520 

TGCTCCTGTC AGGCAAACAT CCACTTCTGG GGGCATTTAC TCTAAGAAAG AAGATCTTGA 2580

GAGTGGTGAT GGTAATAATA ACCAGCATGC AAACCTATCC AATCAGGATG ATGTTGAAAA 2640 

GCTTTTTGGA GTTCGACTGA AAAGAGCCCC TCCCTCGCAG AAGTATAAGA GTGAGAAACA 2700 

AGATAACTTC ACCCAGCTTG CTTCAGTGCC CTCGGGCCCA ATTTCATCCT CTGTAGGCAG 2760 

GGGACATAAA ATCAGAAGCA CTTCCCAGGG GCTCCTGGAT GCTGCAGGGA ACCTCACCAA 2820 

AATATCTTAC GTTGCAGATA AGCAACAGAG CAGGCCCAAA TCTGAAAGCA TGGCCAAGAA 2880

GCAACCTGCT TGCAAGACCC CAGGAAAGCC TGCTGGTCAA CAGTCAGATT ATGCTGTCTC 2940 

AGAGCCGGTT TGGATAACTA TGGCAAAGCA GAAGCAGAAG AGTTTCAAGG CCCACATTTC 3000 

TGTGAAAGAG CTGAAAACTA AGAGCAATGC TGGAGCCGAT GCTGAGACTA AGGAGCCTAA 3060 

ATATGAGGGA GCTGGCTCTG CAAATGAAAA CCAACCTAAA AAGATGTTCA CTTCCAGTGT 3120 

CCATAAACAG GAGAAGACAG CACAGATGAA GCCACCTAAG CCTACAAAAT CAGTTGGATT 3180

TGAAGCTCAG AAGATACTGC AAGTTCCTGC CATGGAAAAA GAAACCAAAC GATCTTCAAC 3240

TCTCCCAGCC AAGTTCCAGA ACCCAGTTGA GCCAATTGAG CCTGTCTGGT TCTCACTGGC 3300

CAGGAAGAAA GCCAAAGCAT GGAGCCACAT GGCAGAAATC ACGCAATAAA GAGCTCTTGT 3360

GTGGAGCATC AGCATTTATT TTATTTAGTT TTTTTTTTTT TTTTTTTTTT GAGACAGAGT 3420

CTCGCTCTGT TACCCAGATT GGAGTGCAGT GGCGCGATCT CCGCTCACTG CAAGCTCCGC 3480

CTCCCGGGTT CACGCCACTC TCCCGCCTCA GTCTCCCGAC TAGCTGGGAC TACAGGCGCC 3540

CGCCATCACG CCCGGCTAAT TTTGTTTTCG TATTTTTAGT AGAGACGGGG TTTCACCATG 3600

TTGGCCAGGA TGGTCTTGAT CTCCTGACCT CGTGATCCGC CCGCCTCAGC CTCCCAAAAG 3660

C CTGGGATTAC AGGCGTGAGC CACCGCGCCC GGCCAAGCAT CAGCGTTTTA AATGATAATT 3720

GCTAATAGCT GTATTAATTC TATGTAGTGA TCTTTTTACT GTGACCACTT GTATTAAGCA 3780

AAATAAGTAT TAAGCAAACT AAGAATTTAT TAAGCAAAAT AAGAATTTAT TAAGCAAAAT 3840

AGCCTTAGAA ATGCAAATTA AAACATAATT ATTTGAATGA AATAAATGCC ATGAATGCTT 3900

AACCTTCCAC GTAGTCACTG CCAGCACCCA GAAACCCAGC ATTTCCTCTA TTAAAACTAT 3960

CGAAAACATT TGCACTGCTG TAAAATTGCA AAATCTTTAA CTTTGGACAA TGTGCTTTAG 4020

AAGGGAGAAA GCAAAAACAT TTTGTTGGAG CAACTAGAAA ATTGTCATTT CCCTCAACCA 4080

AATAAAGTAA TTCTAATGGA AACATTCAGA TGATTTGACC TAAAGATTGG CCTTTAGGTT 4140

TTATGAGCCT AGATAGATGC CGCAATTATT TGGTTGTTGC TCTAAGCTTT GCAAGGGATC 4200 

CTAAAAGAGG CGGTGGAAGT GAAAATTCTG GGTCTCCAAG AAAATTTCTG CACAGCCAGT 4260 

TCTCCAATCA GCCTATCACC CCTTGAAACA TCTTCCCTGT GTCCCTGGGG GCCCCTGATG 4320 

CTTTCTCCTT GGGTGATAGT AACATGCAGA GCACTTACAC AAAGCTCCCT CTTTGGACAT 4380

ACCCCACGTC GACCTGTCAC AGGCCTGGCT GTAGCGAGCA CCTCCCTATG ACGCAGAATG 4440

CTTCTTGGGA ATTATCTTAC TCCTCTGGAG GGTTAGTCCA TCAATGTTTT GCTTCTTGTC 4500

CCAATACTAC TGTGACCCTC TCTGATCGCA CAGAAATCAC TGCCTATCAC ATATATCCTG 4560 

TTAAGCACTG AAGACCCTAT TGAAATTAGA GTTCTACAGA TGCCAAAAGC TGTACTTTCC 4620

ATCAGGCAGA TGGCAAGCTT ACTGCCTTGA TGCACATCTG GAGCCACTGG AGCTCCTTCC 4680

TCTCTGGTTC CAGCATTAAG GTGGAGAACT CCATGTAGCT TCTTGTCCTT TCCCCTCAGC 4740 

TGTCTTTGCT TCACAAGGTT TTAGCCCAAA GCAAGAGTGC AATCCCAAAG CCACAGAGAA 4800 

ATGAACTTTC CGCTACCTGG AAGCTTTAAG TGAGTAAATC AGCTTTTCCC CTCTCATTCC 4860 

TAGAGGCACA CACCTCAAAA GTTACTAGGC TGGAGAGACC CTACCTTCCA GTGACCCACT 4920 

CATCCCCCAG CCACGGAGAA GAGGGAAGAC CAAAAAGGGA GAGTGAGAAA GAGGATGAGA 4980

GGGATGGTCA GCTGTGAGGG GAGGGGGCAA GTGGCCCAGC AAATGTTGAT GCCTCCCTTC 5040 

CCATCTTGCC ACACGGTCTT TTTCTTTTGT AGCACAGCCT CCATTAATAA CTCCTCGGCT 5100 

GAGGATGAAG ATGTAGGCAC CTTTACCCCC AGAGCCAGTT CCTTAATTGG CTGGCTTTCT 5160 

GAGATGCAGA CCACCCTAGA ATCTCATCTA GGTTCACTAG AAGTTAGTTA AATCTTCCTT 5220 

TCTCTGTCTT TCTCTTCATT CCATCCCCCA AACCCACCAA ACACTAAGGG AGAGCTCCCT 5280

TTGGATGTCT GGGCAGTAAA CCTAGCTCAT TTTTCTAGGA GACCCAGAAG TGACTTCTGA 5340 

GTAGTTATCA CTGTGTCTGC CTCTGTTACA CTGTGCTGCT TTGCTTAAAC AGAAATGCAG 5400 

GCCTGGACAT CTGACTGTGC CTTTATATTC TGAGTGGGGT GCTGCCCCAT GCAAAAAAAT 5460 

CCAGAGAGGT AGTGAGGTGT CAGAGCTAAA CACTTGGTGC TGGGTTTTGT TGATGCTGGT 5520

ATAATGTGAC ACAGTACAAT TACATGCTAA ATTTTGCATT TTCTCTATAT AACATCTATT 5580 

TTTCCTGATA CTGTGCCTTT GCCATTTTGA TAATGCTATT TTGATTGAGT GAATTTTATT 5640 

TCCTTTGTAT TCCCATAGTG AACAATATAT TAAGGTAGAT GCCCTTTATC TGGGTACTCC 5700

TGGTAGATTA GCTGTTACAC CTCCCTTCCC TTTTTTACAG TGAACCTGTA TTCAGTTATT 5760 

GTCACTCTGA GAACTCTCCA ATAACAATTT CTTTTCCACA GTTAACAACA CAGCTGTTAC 5820 

ACCTCCCTTC CTTTTTTTCA CAGTGAACCT GTATTCAGCT ATTCTCACTC TGAGAACTCT 5880 

CCAATAACAA TTTCTTTTCC ACAGTTAACA ACAAAGTTCT GTTTTTAAAT GAAGAGATTA 5940 

AGTTCTTTTT AAATGCCTAA AGGCATATTC TGACAACTTT TCTACTTCTT TAACTTTTTT 6000 
GATTTAAGAT ATATGCAAAG CAAATAAATT CAATAAAGCC T 

SEQ ID NO:48 PDG5 Protein sequence
Protein Accession #: BAA86524 

1 11 21 31 41 51 

| | | | | |

EQPTTSQPET TTPQGLLSDK DDMGRRNAGI DFGSRKASAA QPIPENMDNS MVSDPQPYHE 60 






Attorney Docket No.: 018501-004200US 



DAASGAEKTE ARASLSLMVE SLSTTQEEAI LSVAAEAQVF MNPSHIQLED QEAFSFDLQK 120 

AQSKMESAQD VQTICKEKPS GNVHQTFTAS VLGMTSTTAK GDVYAKTLPP RSLFQSSRKP 180 

DAEEVSSDSE NIPEEGDGSE ELAHGHSSQS LGKFEDEQEV FSESKSFVED LSSSEEELDL 240 

RCLSQALEEP EDAEVFTESS SYVEKYNTSD DCSSSEEDLP LRHPAQALGK PKNQQEVSSA 300 

SNNTPEEQND FMQQLPSRCP SQPIMNPTVQ QQVPTSSVGT SIKQSDSVEP IPPRHPFQPW 360

VNPKVEQEVS SSPKSMAVEE SISMKPLPPK LLCQPLMNPK VQQNMFSGSE DIAVERVISV 420 

EPLLPRYSPQ SLTDPQIRQI SESTAVEEGT YVEPLPPRCL SQPSERPKFL DSMSTSAEWS 480

SPVAPTPSKY TSPPWVTPKF EELYQLSAHP ESTTVEEDIS KEQLLPRHLS QLTVGNKVQQ 540 

LSSNFERAAI EADISGSPLP PQYATQFLKR SKVQEMTSRL EKMAVEGTSN KSPIPRRPTQ 600 

SFVKFMAQQI FSESSALKRG SDVAPLPPNL PSKSLSKPEV KHQVFSDSGS ANPKGGISSK 660

MLPMKHPLQS LGRPEDPQKV FSYSERAPGK CSSFKEQLSP RQLSQALRKP EYEQKVSPVS 720

ASSPKEWRNS KKQLPPKHSS QASDRSKFQP QMSSKGPVNV PVKQSSGEKH LPSSSPFQQQ 780 

VHSSSVNAAA RRSVFESNSD NWFLGRDEAF AIKTKKFSQG SKNPIKSIPA PATKPGKFTI 840 

APVRQTSTSG GIYSKKEDLE SGDGNNNQHA NLSNQDDVEK LFGVRLKRAP PSQKYKSEKQ 900 

DNFTQLASVP SGPISSSVGR GHKIRSTSQG LLDAAGNLTK ISYVADKQQS RPKSESMAKK 960

QPACKTPGKP AGQQSDYAVS EPVWITMAKQ KQKSFKAHIS VKELKTKSNA GADAETKEPK 1020

YEGAGSANEN QPKKMFTSSV HKQEKTAQMK PPKPTKSVGF EAQKILQVPA MEKETKRSST 1080
LPAKFQNPVE PIEPVWFSLA RKKAKAWSHM AEITQ 

SEQ ID NO:49 PAB7 DNA SEQUENCE

Nucleic Acid Accession #: D87742 

Coding sequence: 208-3582 (underlined sequences correspond to start and stop codons) 

1 11 21 31 41 51 

| | | | | |

GCTTTCCTTT CTAAAGTAGA AGAGGATGAT TATCCCTCTG AAGAACTACT AGAGGATGAA 60 

AACGCTATAA ATGCAAAACG GTCTAAAGAA AAAAACCCTG GGAATCAGGG CAGGCAGTTT 120

GATGTTAATC TGCAAGTCCC TGACAGAGCA GTTTTAGGGA CCATTCATCC AGATCCAGAA 180

ATTGAAGAAA GCAAGCAAGA AACTAGTATG ATTTTGGATA GTGAAAAAAC AAGTGAGACT 240

GCTGCCAAAG GGGTCAACAC AGGAGGCAGG GAACCAAATA CAATGGTGGA AAAAGAACGC 300

CCTCTGGCAG ATAAGAAAGC ACAGAGACCA TTTGAACGAA GTGACTTTTC TGACAGCATA 360

AAAATTCAGA CTCCAGAATT AGGTGAAGTG TTTCAGAATA AAGATTCTGA TTATCTGAAG 420

AACGACAACC CTGAGGAACA TCTGAAGACC TCAGGGCTTG CAGGGGAGCC TGAGGGAGAA 480

CTCTCAAAAG AGGACCATGG GAACACAGAG AAGTACATGG GCACAGAAAG CCAGGGGTCT 540

GCTGCTGCAG AACCTGAAGA TGACTCGTTC CACTGGACTC CACATACAAG TGTAGAGCCA 600

GGGCATAGTG ACAAGAGGGA GGACTTACTT ATCATAAGCA GCTTCTTTAA AGAACAACAG 660

TCTTTGCAGC GGTTCCAGAA GTACTTTAAT GTCCATGAGC TGGAAGCCTT GCTACAAGAA 720

ATGTCATCAA AACTGAAGTC AGCGCAGCAG GAGAGCCTGC CCTATAATAT GGAAAAAGTC 780 

CTAGATAAGG TCTTCCGTGC TTCTGAGTCA CAAATTCTGA GCATAGCAGA AAAAATGCTT 840 

GATACTCGTG TGGCTGAAAA TAGAGATCTG GGAATGAACG AAAATAACAT ATTTGAAGAG 900

GCTGCAGTGC TTGATGACAT TCAAGACCTC ATCTATTTTG TCAGGTACAA GCACTCCACA 960

GCAGAGGAGA CAGCCACACT GGTGATGGCA CCACCTCTAG AGGAAGGCTT GGGTGGAGCA 1020

ATGGAAGAGA TGCAACCACT GCATGAAGAT AATTTCTCAC GAGAGAAGAC AGCAGAACTT 1080

AATGTGCAGG TTCCTGAAGA ACCCACCCAC TTGGACCAAC GTGTGATTGG GGACACTCAT 1140

GCCTCAGAAG TGTCACAGAA GCCAAATACT GAGAAAGACC TGGACCCAGG GCCAGTTACA 1200

ACAGAAGACA CTCCTATGGA TGCTATTGAT GCAAACAAGC AACCAGAGAC AGCCGCCGAA 1260

GAGCCGGCAA GTGTCACACC TTTGGAAAAC GCAATCCTTC TAATATATTC ATTCATGTTT 1320

TATTTAACTA AGTCGCTAGT TGCTACATTG CCTGATGATG TTCAGCCTGG GCCTGATTTT 1380

TATGGACTGC CATGGAAACC TGTATTTATC ACTGCCTTCT TGGGAATTGC TTCGTTTGCC 1440

ATTTTCTTAT GGAGAACTGT CCTTGTTGTG AAGGATAGAG TATATCAAGT CACGGAACAG 1500

CAAATTTCTG AGAAGTTGAA GACTATCATG AAAGAAAATA CAGAACTTGT ACAAAAATTG 1560 

TCAAATTATG AACAGAAGAT CAAGGAATCA AAGAAACATG TTCAGGAAAC CAGGAAACAA 1620 

AATATGATTC TCTCTGATGA AGCAATTAAA TATAAGGATA AAATCAAGAC ACTTGAAAAA 1680 

AATCAGGAAA TTCTGGATGA CACAGCTAAA AATCTTCGTG TTATGCTAGA ATCTGAGAGA 1740 

GAACAGAATG TCAAGAATCA GGACTTGATA TCAGAAAACA AGAAATCTAT AGAGAAGTTA 1800

AAGGATGTTA TTTCAATGAA TGCCTCAGAA TTTTCAGAGG TTCAGATTGC ACTTAATGAA 1860 

GCTAAGCTTA GTGAAGAGAA GGTGAAGTCT GAATGCCATC GGGTTCAAGA AGAAAATGCT 1920

AGGCTTAAGA AGAAAAAAGA GCAGTTGCAG CAGGAAATCG AAGACTGGAG TAAATTACAT 1980 

GCTGAGCTCA GTGAGCAAAT CAAATCATTT GAGAAGTCTC AGAAAGATTT GGAAGTAGCT 2040 

CTTACTCACA AGGATGATAA TATTAATGCT TTGACTAACT GCATTACACA GTTGAATCTG 2100

TTAGAGTGTG AATCTGAATC TGAGGGTCAA AATAAAGGTG GAAATGATTC AGATGAATTA 2160 

GCAAATGGAG AAGTGGGAGG TGACCGGAAT GAGAAGATGA AAAATCAAAT TAAGCAGATG 2220

ATGGATGTCT CTCGGACACA GACTGCAATA TCGGTAGTTG AAGAGGATCT AAAGCTTTTA 2280 

CAGCTTAAGC TAAGAGCCTC CGTGTCGACT AAATGTAACC TGGAAGACCA GGTAAAGAAA 2340

TTGGAAGATG ACCGCAACTC ACTACAAGCT GCCAAAGCTG GACTGGAAGA TGAATGCAAA 2400

ACCTTGAGGC AGAAAGTGGA GATTCTGAAT GAGCTCTATC AGCAGAAGGA GATGGCTTTG 2460 

CAAAAGAAAC TGAGTCAAGA AGAGTATGAA CGGCAAGAAA GAGAGCACAG GCTGTCAGCT 2520 

GCAGATGAAA AGGCAGTTTC GGCTGCAGAG GAAGTAAAAA CTTACAAGCG GAGAATTGAA 2580 

GAAATGGAGG ATGAATTACA GAAGACAGAG CGGTCATTTA AAAACCAGAT CGCTACCCAT 2640 

GAGAAGAAAG CTCATGAAAA CTGGCTCAAA GCTCGTGCTG CAGAAAGAGC TATAGCTGAA 2700

GAGAAAAGGG AAGCTGCCAA TTTGAGACAC AAATTATTAG AATTAACACA AAAGATGGCA 2760

ATGCTGCAAG AAGAACCTGT GATTGTAAAA CCAATGCCAG GAAAACCAAA TACACAAAAC 2820

CCTCCACGGA GAGGTCCTCT GAGCCAGAAT GGCTCTTTTG GCCCATCCCC TGTGAGTGGT 2880

GGAGAATGCT CCCCTCCATT GACAGTGGAG CCACCCGTGA GACCTCTCTC TGCTACTCTC 2940 

AATCGAAGAG ATATGCCTAG AAGTGAATTT GGATCAGTGG ACGGGCCTCT ACCTCATCCT 3000

CGATGGTCAG CTGAGGCATC TGGGAAACCC TCTCCTTCTG ATCCAGGATC TGGTACAGCT 3060 

ACCATGATGA ACAGCAGCTC AAGAGGCTCT TCCCCTACCA GGGTACTCGA TGAAGGCAAG 3120 

GTTAATATGG CTCCAAAAGG GCCCCCTCCT TTCCCAGGAG TCCCTCTCAT GAGCACCCCC 3180 

ATGGGAGGCC CTGTACCACC ACCCATTCGA TATGGACCAC CACCTCAGCT CTGCGGACCT 3240 

TTTGGGCCTC GGCCACTTCC TCCACCCTTT GGCCCTGGTA TGCGTCCACC ACTAGGCTTA 3300






AGAGAATTTG CACCAGGCGT TCCACCAGGA AGACGGGACC TGCCTCTCCA CCCTCGGGGA 3360 

TTTTTACCTG GACACGCACC ATTTAGACCT TTAGGTTCAC TTGGCCCAAG AGAGTACTTT 3420 

ATTCCTGGTA CCCGATTACC ACCCCCAACC CATGGTCCCC AGGAATACCC ACCACCACCT 3480 

GCTGTAAGAG ACTTACTGCC GTCAGGCTCT AGAGATGAGC CTCCACCTGC CTCTCAGAGC 3540

ACTAGCCAGG ACTGTTCACA GGCTTTAAAA CAGAGCCCAT AAAACTATGA CCTCTGAGGT 3600

TTCATTGGAA AGAAAGTGTA CTGTGCATTA TCCATTACAG TAAAGGATTT CATTGGCTTC 3660 

AAAATCCAAA AGTTTATTTT AAAAGGTTTG TTGTTAGAAC TAAGCTGCCT TGGCAGTGTG 3720 

CATTTTTGAG CCAAACAATT CAAAAATGTC ATTTCTTCCC TAAATAAAAA TCACCTTTTA 3780 

AGCTAGAGCG TCCTTACAAC TTTGAAATGT GCAATAAAGA ATACCTGTGT TTTAGCTAAT 3840

GTAGCATATG TAATTGCAAA ATGATTTAGA ATGTCATGAA AAATATGAAC ATTTCCTGTG 3900

GAAATGCTTT AAGAACATGT ATTTCCATTA TCCTATTTTT AGTGTACACC AGCTGAATAC 3960

GGAGCAATGG TGTTTATAAG CGTTTTTTTA AACTATCTGG TCACAAAGAC TGTTACGCTA 4020 

AAAATGTTTA CTAAAAGATC ACTAAACTAT CTCCCCTCTT GCTGAAGTTC TTTGTAGTAA 4080 

TAGCTCATAA AAATTTGTTT ATTAATATTT CCCAAGTGTC TGTTGACTCA TTGGACTGTT 4140 

ATGAGGCTTG TGCCATTTGG GGAACATGTA AACTCAGGCT CCCAGAACTG AAGATGGTGG 4200 

CTGGTGGCAC ACTTCCGGCT GCTCCTCCGT CACCTGTGAA CTCTACAAGT GATGTCTTTT 4260 

TATTTCAAAG AAGTTTATTT CCCACTTGTA TAGCATTCAC ATGCTTTCTT TACGATCCTC 4320 

ATTGTCTATT TGAGAATGGT TTTCTGAGAG TGAGTTTACA TTAGTAGCAA GAGTTGTTTG 4380 

ACCTGATGTT CCATTGTTTT TACCATTCCT GTAGAAAAAG GGTGCACAAC AGAAAAATGA 4440 

AAATGATGTG TCATGGCCAT AAAAGTATAG AAATCTTTAA AAATTTTAAA ATGTACAGTC 4500

CCTTATCTAT CTTTCCCATT CCTTGCCACT GATTTTTGAG GAATATAATA AAAAGATTGG 4560

AAGAGTATAA TGCCATGAGA AAGAATGATT TAGGACTGTG AGGGTTATAA CATGCCCTAG 4620 

GTCAGCAACC AAGGGTTGAA ATCAGTTCTG TTTTAGGGGG AAATGGGGGG GGCGACAGAT 4680 

ATTATTCCAA AATTAATATT AATTAATATT TAAACGTTGG TGTTTTTATT TAAAAATCAG 4740

TAACTAACCA TCTGGAATTG CACCATACTT AAAGTCTTAT CCATTACTAC ACTGTCTTTA 4800 

AAACAATGTT TCTTTAAATA CTCTACAACG TTTCTAAGAA CGAACTTCAG ACATTTTAAT 4860

TACAGTAATA ATAGCACTCC TTTTAAGGAG TTTCAGATCC ACACTAAAAC TAAAATCATA 4920 

AAAGGCTGAT ACTTTTGTTT GCTGCTAGGC TATATTCTTC CATTCTTTGA AGTCCTATGA 4980

TGTAATATTT TTGAAACCTA GTGTATGTCT TGTCACTGTT GTGATATTTA ATCGATTAAG 5040 

AATACCTTGT AAAAAGGAGC AAAAGCTTCA ATGTGAAACA ATTTTCTCTC TTTATACTAA 5100 

ACAACTGAAG ATAGATAGTT TAGAAAGATA AGGACCTTTG AAAGAAGACA ACTCTGTCAA 5160 

AGTTCATAAG GAATATAAAA ATTCTTCAGG AAAAGAGAAT TCAATCTATA TGTCCTCCCG 5220 

TTTAATATCA AGAATAGAAG AAATTAAGAG GAAAACTCCA CAGAAGAGCA TAGGCCACTT 5280 

TTAGCCATGT AAAAATAAGA TTAAGTCACA AATACAACTT TTGAATTTAC CTGTCAATAT 5340 

CTCTTTAGGA CACAAAACAA TGCTGAAGTT AATATAATTT CTAATTTTAA ATGTCATTTA 5400 

AGTGTAGATT ATGCCATCTA GGAAGGTAAG TAGGAAAGGT AAATTAAATC TATTTTTAAA 5460 

ATTCAAAATA TTAGAGTATT TTTCCCCTCT AAAGCCTTTT TTGGTGATTA TTCTGTATCT 5520

GACATAATTG AGAAACTGGT AAGCTGTAAA GATTCCAGTG TAGCTTCTCT GAGAAGTTGT 5580 

GAGCCAGTCC ATAACTGCTT CCTCACATCC ATCTGATTGC ACCATTTCTG CAGCAAACCC 5640 

CAAAGCAGGG TGCCAATATG CAGATGGCAT AGGGAGTATC ATCCCTCAGC CAAATCACTT 5700 

TTCCATCTCT AAAGTTTCAT CTATTTTGGA AGTCATCTCC AACTAATTGT GTCTGGATTT 5760

AGTTGCTAAA ATTGTCTTAT TTATTTATGA AGCAGCAATA TTCAGCCTGA AAGCATTTCT 5820 

GCCATAGTTG TTGTAGTTAT ATCGCCAATG GCTGATTTTT TTCATTGGAA AGTAAATTTA 5880 

AGTAATTCGT GGGATGTGGT ATATTCTGTG TCAACTTCAA GATAATCACT CATTTTCTCG 5940 
TTATATTCAG GTCTGAATTA AAGTTAAGTT AATCAC 



SEQ ID NO:50 PAB7 Protein sequence
Protein Accession #: BAA13448 

1 11 21 31 41 51 

| | | | | |

AFLSKVEEDD YPSEELLEDE NAINAKRSKE KNPGNQGRQF DVNLQVPDRA VLGTIHPDPE 60 

IEESKQETSM ILDSEKTSET AAKGVNTGGR EPNTMVEKER PLADKKAQRP FERSDFSDSI 120 

KIQTPELGEV FQNKDSDYLK NDNPEEHLKT SGLAGEPEGE LSKEDHGNTE KYMGTESQGS 180

AAAEPEDDSF HWTPHTSVEP GHSDKREDLL IISSFFKEQQ SLQRFQKYFN VHELEALLQE 240 

MSSKLKSAQQ ESLPYNMEKV LDKVFRASES QILSIAEKML DTRVAENRDL GMNENNIFEE 300

AAVLDDIQDL IYFVRYKHST AEETATLVMA PPLEEGLGGA MEEMQPLHED NFSREKTAEL 360 

NVQVPEEPTH LDQRVIGDTH ASEVSQKPNT EKDLDPGPVT TEDTPMDAID ANKQPETAAE 420 

EPASVTPLEN AILLIYSFMF YLTKSLVATL PDDVQPGPDF YGLPWKPVFI TAFLGIASFA 480 

IFLWRTVLVV KDRVYQVTEQ QISEKLKTIM KENTELVQKL SNYEQKIKES KKHVQETRKQ 540

NMILSDEAIK YKDKIKTLEK NQEILDDTAK NLRVMLESER EQNVKNQDLI SENKKSIEKL 600 

KDVISMNASE FSEVQIALNE AKLSEEKVKS ECHRVQEENA RLKKKKEQLQ QEIEDWSKLH 660

AELSEQIKSF EKSQKDLEVA LTHKDDNINA LTNCITQLNL LECESESEGQ NKGGNDSDEL 720 

ANGEVGGDRN EKMKNQIKQM MDVSRTQTAI SVVEEDLKLL QLKLRASVST KCNLEDQVKK 780

LEDDRNSLQA AKAGLEDECK TLRQKVEILN ELYQQKEMAL QKKLSQEEYE RQEREHRLSA 840 

ADEKAVSAAE EVKTYKRRIE EMEDELQKTE RSFKNQIATH EKKAHENWLK ARAAERAIAE 900

EKREAANLRH KLLELTQKMA MLQEEPVIVK PMPGKPNTQN PPRRGPLSQN GSFGPSPVSG 960 

GECSPPLTVE PPVRPLSATL NRRDMPRSEF GSVDGPLPHP RWSAEASGKP SPSDPGSGTA 1020 

TMMNSSSRGS SPTRVLDEGK VNMAPKGPPP FPGVPLMSTP MGGPVPPPIR YGPPPQLCGP 1080 

FGPRPLPPPF GPGMRPPLGL REFAPGVPPG RRDLPLHPRG FLPGHAPFRP LGSLGPREYF 1140 
IPGTRLPPPT HGPQEYPPPP AVRDLLPSGS RDEPPPASQS TSQDCSQALK QSP 

SEQ ID NO:51 PAB9 DNA SEQUENCE

Nucleic Acid Accession #: NM_006457

Coding sequence: 84-1874 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51 

I I I I I I 

AGACTGAGGC GGAGGCAGCC CCGCGCCGCG CCGGACCCGA GCATATTTCA TTTTCTGTCA 60
TTGGACTTTG AGCCATTAGA ACCATGAGCA ACTACAGTGT GTCACTGGTT GGCCCAGCTC 120
CTTGGGGTTT CCGGCTGCAG GGCGGTAAGG ATTTCAACAT GCCTCTGACA ATCTCTAGTC 180
TAAAAGATGG CGGCAAGGCA GCCCAGGCAA ATGTAAGAAT AGGCGATGTG GTTCTCAGCA 240
TTGATGGAAT AAATGCACAA GGAATGACTC ATCTTGAAGC CCAGAATAAG ATTAAGGGTT 300
GTACAGGCTC TTTGAATATG ACTCTGCAAA GAGCATCTGC TGCACCCAAG CCTGAGCCGG 360
TTCCTGTTCA AAAGGGAGAA CCTAAAGAAG TAGTTAAACC TGTGCCCATT ACATCTCCTG 420
CTGTGTCCAA AGTCACTTCC ACAAACAACA TGGCCTACAA TAAGGCACCA CGGCCTTTTG 480
GTTCTGTGTC TTCACCAAAA GTCACATCCA TCCCATCACC ATCGTCTGCC TTCACCCCAG 540
CCCATGCGAC CACCTCATCA CATGCTTCCC CTTCACCCGT GGCTGCCGTC ACTCCTCCCC 600
TGTTCGCTGC ATCTGGACTG CATGCTAATG CCAATCTTAG TGCTGACCAG TCTCCATCTG 660
CACTGAGCGC TGGTAAAACT GCAGTTAATG TCCCACGGCA GCCCACAGTC ACCAGCGTGT 720
GTTCCGAGAC TTCTCAGGAG CTAGCAGAGG GACAGAGAAG AGGATCCCAG GGTGACAGTA 780
AACAGCAAAA TGGCCCACCA AGAAAACACA TTGTGGAGCG CTATACAGAG TTTTATCATG 840
TACCCACTCA CAGTGATGCC AGCAAGAAGA GACTGATTGA GGATACTGAA GACTGGCGTC 900
CAAGAACTGG AACAACTCAG TCTCGCTCTT TCCGAATCCT TGCCCAGATC ACTGGGACTG 960
AACATTTGAA AGAATCTGAA GCCGATAATA CAAAGAAGGC AAATAACTCT CAGGAGCCTT 1020
CTCCGCAGTT GGCTTCCTTG GTAGCTTCCA CACGGAGCAT GCCCGAGAGC CTGGACAGCC 1080
CAACCTCTGG CAGACCAGGG GTTACCAGCC TCACAACTGC AGCTGCCTTC AAGCCTGTAG 1140
GATCCACTGG CGTCATCAAG TCACCAAGCT GGCAACGGCC AAACCAAGGA GTAGCTTCCA 1200
CTGGAAGAAT CTCAAACAGC GCTACTTACT CAGGATCAGT GGCACCAGCC AACTCAGCTT 1260
TGGGACAAAC CCAGCCAAGT GACCAGGACA CTTTAGTGCA AAGAGCTGAG CACATTCCAG 1320
CAGGGAAACG AACTCCGATG TGCGCCCATT GTAACCAGGT CATCAGAGGA CCATTCTTAG 1380
TGGCACTGGG GAAATCTTGG CACCCAGAAG AATTCAACTG CGCTCACTGC AAAAATACAA 1440
TGGCCTACAT TGGATTTGTA GAGGAGAAAG GAGCCCTGTA TTGTGAGCTG TGCTATGAGA 1500
AATTCTTTGC CCCTGAATGT GGTCGATGCC AAAGGAAGAT CCTTGGAGAA GTCATCAATG 1560
CGTTGAAACA AACTTGGCAT GTTTCCTGTT TTGTGTGTGT AGCCTGTGGA AAGCCCATTC 1620
GGAACAATGT TTTTCACTTG GAGGATGGTG AACCCTACTG TGAGACTGAT TATTATGCCC 1680
TCTTTGGTAC TATATGCCAT GGATGTGAAT TTCCCATAGA AGCTGGTGAC ATGTTCCTGG 1740
AAGCTCTGGG CTACACCTGG CATGACACTT GCTTTGTATG CTCAGTGTGT TGTGAAAGTT 1800
TGGAAGGTCA GACCTTTTTC TCCAAGAAGG ACAAGCCCCT GTGTAAGAAA CATGCTCATT 1860
CTGTGAATTT TTGAAAGTCA ACAGTTCAGG AGAAGAGAAG GAATTTGAAG AGAAAAAGGA 1920
AAATTAAAAT TACTAATTAA TTTTTAGATT CAATATTTAT ATGGAGTTTT GAAAAATAAT 1980
AGTGGCCCTG AAGGAATAAA TTCCAGCTTT AAAAACCAAG TCTGAGGAAA TATTTGGCTT 2040
CATAAAGTAA AGAGACGGTT TGGCATTTAT TATTACTTTT TCCTGTATTT TATGCCCATA 2100
AAATAAGCTT TATAAAAACC AATTTCCTGA TGGACTATTA AATTCATCTT AGAATAAATT 2160
AGTGAAGAAT TTAATTTTAG AATAAATAAT CCAATCTGAA ATAATTATAC CTTCTTTCCT 2220
TGTTAGGTAG TTATGAGTAA ATCTGCAAAA GGCAATGAAA ATGCCTTAAA TTTTATCAAT 2280
AACAGAATTA TTGTATTTAA AAAAAAACTA ATACTTATCT TTAAAATAGT AAATAGGATT 2340
TTAAACAGAG AATTTTATCA GTAATAGGTG TCAGTTTTTA AAAAATTGCT TGTAGGCTGA 2400
GCGCGGTGGC TCACGCCTGT AATCCCAGCA CTTTGGGAGG CCAAGGTGGG TGGACCACAT 2460
GAGGTCAGGA GTTTGAGATC AGCCTGGCCA ACATGGTGAA ACCCCATCTC TACTAAAAAT 2520
ACAAAAATTA GCCGGACGCA GTGGCACGCG CCTGTAATCC CAGCTACTCA AGAGGCTGAG 2580
GCACGAGAAT CACTTGAACC CGGGAGGGAG AGGTTGCAGT GAGCCAAGAT CGTACCACTG 2640
CACTCCAGCC TGGGTGACAG AGTGAGACTC CGTCTCCAAA AAAAAACTTT GCTTGTATAT 2700
TATTTTTGCC TTACAGTGGA TCATTCTAGT AGGAAAGGAC AATAAGATTT TTTATCAAAA 2760
TGTGTCATGC CAGTAAGAGA TGTTATATTC TTTTCTTATT TCTTCCCCAC CCAAAAATAA 2820
GCTACCATAT AGCTTATAAG TCTCAAATTT TTGCCTTTTA CTAAAATGTG ATTGTTTCTG 2880
TTCATTGTGT ATGCTTCATC ACCTATATTA GGCAAATTCC ATTTTTTCCC TTGCGCTAAG 2940
GTAAAGATTT AATTAAATAA TTTTGGCCTC TCATAGTTTT CTCTCTCTTT AAAGAGAATA 3000
AATAGAGGGC CAGGTGTGGT GGCTCACGCC TGTGATCCCA GCACTTTGGG AGGCCAAGAC 3060
GGGCGGATCA TGAGGTCAAG AGATCAAGAT CATCCTGGCC AACATGGTGA AACCCTGTCT 3120
CTACTAAAAA TACAAAAATG AGCTGGGCAT GGTGGGGCGT GCCTGTAGTC CCATGTACTT 3180
GGGAGGCTGA GGCAGGAAAA TTCTTGAACC CAGGAGACGG AAGTTGCAGT GAGCTGAGAT 3240
CACACCACTG CACTCCAGCC TGGTGACAGA GCAAGACTCC GGCTCTT



SEQ ID NO:52 PAB9 Protein sequence
Protein Accession #: NP_006448

1 11 21 31 41 51

| | | | | |

MSNYSVSLVG PAPWGFRLQG GKDFNMPLTI SSLKDGGKAA QANVRIGDVV LSIDGINAQG 60
MTHLEAQNKI KGCTGSLNMT LQRASAAPKP EPVPVQKGEP KEVVKPVPIT SPAVSKVTST 120
NNKAYNKAPR PFGSVSSPKV TSIPSPSSAF TPAHATTSSH ASPSPVAAVT PPLFAASGLH 180
ANANLSADQS PSALSAGKTA VNVPRQPTVT SVCSETSQEL AEGQRRGSQG DSKQQNGPPR 240
KHIVERYTEF YHVPTHSDAS KKRLIEDTED WRPRTGTTQS RSFRILAQIT GTEHLKESEA 300
DNTKKANNSQ EPSPQLASLV ASTRSMPESL DSPTSGRPGV TSLTTAAAFK PVGSTGVIKS 360
FSWQRPNQGV PSTGRISNSA TYSGSVAPAN SALGQTQPSD QDTLVQRAEH IPAGKRTPMC 420
AHCNQVIRGP FLVALGKSWH PEEFNCAHCK NTMAYIGFVE EKGALYCELC YEKFFAPECG 480
RCQRKILGEV INALKQTWHV SCFVCVACGK PIRNNVFHLE DGEPYCETDY YALFGTICHG 540
CEFPIEAGDM FLEALGYTWH DTCFVCSVCC ESLEGQTFFS KKDKPLCKKH AHSVNF



SEQ ID NO:53 PBH7 DNA SEQUENCE

Nucleic Acid Accession #: AA431407

Coding sequence: 1-864 (underlined sequences correspond to start and stop codons)



1 11 21 31 41 51

| | | | | |

ATGGCCAACT GTAAAATGAC CAAAAGCATC AGGTTCCCTG CCCTGGAGCA CTGCTATACT 60

GGCGGGGAGG TCGTGTTGCC CAAGGATCAG GAGGAGTGGA AAAGACGGAC GGGCCTTCTG 120 

CTCTACGAGA ACTATGGGCA GTCGGAAACG GGACTAATTT GTGCCACCTA CTGGGGAATG 180
AAGATCAAGC CGGGTTTCAT GGGGAAGGCC ACTCCACCCT ATGACGTCCA GTTTCATATG 240

GAGGCCTCAG TTGAAAACTG CATTATTGTG AGCATGAACA CCGCTGACCC TGGCAGCCAG 300 

GGCATCACAC ACAGCCTCTT GCTACAGGTC ATTGATGACA AGGGCAGCAT CCTGCCACCT 360 

AACACAGAAG GAAACATTGG CATCAGAATC AAACCTGTCA GGCCTGTGAG CCTCTTCATG 420 

TGCTATGAGG GTGACCCAGA GAAGACAGCT AAAGTGGAAT GTGGGGACTT CTACAACACT 480

GGGGACAGAG GAAAGATGGA TGAAGAGGGC TACATTTGTT TCCTGGGGAG GAGTGATGAC 540 

ATCATTAATG CCTCTGGGTA TCGCATCGGG CCTGCAGAGG TTGAAAGCGC TTTGGTGGAG 600 

CACCCAGCGG TGGCGGAGTC AGCCGTGGTG GGCAGCCCAG ACCCGATTCG AGGGGAGGTG 660 

GTGAAGGCCT TTATTGTCCT GACCCCACAG TTCCTGTCCC ATGACAAGGA TCAGCTGACC 720 

AAGGAACTGC AGCAGCATGT CAAGTCAGTG ACAGCCCCAT ACAAGTACCC AAGGAAGGTG 780

GAGTTTGTCT CAGAGCTGCC AAAAACCATC ACTGGCAAGA TTGAACGGAA GGAACTTCGG 840 

AAAAAGGAGA CTGGTCAGAT GTAATCGGCA GTGAACTCAG AACGCACTGC ACACCTGAGG 900

CAAATCCCTG GCCACTTTAG TCTCCCCACT ATGGTGAGGA CGAGGGTGGG GCATTGAGAG 960 

TGTTGATTTG GGAAAGTATC AGGAGTGCCA TGATTCCAAT GTTTTCCTTC TTTTAAATTA 1020 

AATTCAGTTG CTCTGCTTCC TCCAAGTCCT CTGTATCTTT AGAATTTCCC AGGTGAGCAC 1080
TCATAACGCA AGTAATAAAA TACTGATATC AACAA



SEQ ID NO:54 PBH7 Protein sequence 
Protein Accession #: FGENESH predicted 



1 11 21 31 41 51

| | | | | |

MANCKMTKSI RFPALEHCYT GGEVVLPKDQ EEWKRRTGLL LYENYGQSET GLICATYWGM 60
KIKPGFMGKA TPPYDVQFHM EASVENCIIV SMNTADPGSQ GITHSLLLQV IDDKGSILPP 120
NTEGNIGIRI KPVRPVSLFM CYEGDPEKTA KVECGDFYNT GDRGKMDEEG YICFLGRSDD 180

IINASGYRIG PAEVESALVE HPAVAESAVV GSPDPIRGEV VKAFIVLTPQ FLSHDKDQLT 240
KELQQHVKSV TAPYKYPRKV EFVSELPKTI TGKIERKELR KKETGQM

SEQ ID NO:55 PBJ5 DNA SEQUENCE 

Nucleic Acid Accession #: AF388200

Coding sequence: 33-137 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51

| | | | | |

GAGAGAGGGA GGCAGAAGAG GAAGTCAGAG CGATGTGCTG TGAAATCTAC TACCGTTTGC 60

TGGTTTTGAA AATGGAGAAA AAGAGTGAGG AACTGAGAAA CATGGATGGC CTTGGGAACG 120

TGGAAAAGGG TCACTGAAAT GGGACGACAT GAACTCAAGG AGGCTATTTA TGACCATGTC 180

ATTTGCAACA TGAAGAAAGC TTATCTGGAG TGAAAGTAAA TGAGACCAAC AGAGATAAGA 240

GACCCGGAGA AATCCTGGTT ACACTGCTTG AATCCTGTCA GTCCTATACT GGAGTCCTGT 300

TAATACAAAA TAATAGTAAT AATCCCTCTG TTTCTTATGT TTATGCCAAC TTCAACAAAA 360

AGAAACTTGA CTAAGAGACA ATATAAGAAC TTAATGTGTA ATTAAGAAAG AACTCTCCAC 420

CACGGGGAAT GTGAAAGGTA TATGAGTCCC TTTTCACGAT GCGATGTCAT GTCTTTTAAA 480
TAAGCCATAC TTTATGTTCA ATAAAAAGAG AATAAGCAGG A

SEQ ID NO:56 PBJ5 Protein sequence
Protein Accession #: AAK83352

1 11 21 31 41 51

| | | | | |

MCCEIYYRLL VLKMEKKSEE LRNMDGLGKV EKGH
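
The coding-sequence coordinates quoted in this listing (for example 33-137 for SEQ ID NO:55) appear to be 1-based and inclusive. Under that assumption, the following minimal Python sketch shows one way such a coordinate pair can be checked against the printed nucleotide rows. The helper names `extract_cds` and `check_cds` are hypothetical and introduced here only for illustration; they are not part of the listing.

```python
# Hypothetical helpers for illustration only; not part of the sequence listing.
STOP_CODONS = {"TAA", "TAG", "TGA"}


def extract_cds(listing_text, start, end):
    """Return the coding region for 1-based, inclusive coordinates.

    Spaces, line breaks and position counters (60, 120, ...) are dropped,
    so rows can be pasted in exactly as they are printed.
    """
    bases = "".join(ch for ch in listing_text.upper() if ch.isalpha())
    if not 1 <= start <= end <= len(bases):
        raise ValueError("coordinates fall outside the pasted sequence")
    return bases[start - 1:end]


def check_cds(cds):
    """Report basic consistency checks for a coding region."""
    return {
        "length": len(cds),
        "whole_codons": len(cds) % 3 == 0,
        "starts_with_ATG": cds.startswith("ATG"),
        "ends_with_stop": cds[-3:] in STOP_CODONS,
    }


# First three rows of SEQ ID NO:55 (PBJ5), positions 1-180.
SEQ55_ROWS = """
GAGAGAGGGA GGCAGAAGAG GAAGTCAGAG CGATGTGCTG TGAAATCTAC TACCGTTTGC 60
TGGTTTTGAA AATGGAGAAA AAGAGTGAGG AACTGAGAAA CATGGATGGC CTTGGGAACG 120
TGGAAAAGGG TCACTGAAAT GGGACGACAT GAACTCAAGG AGGCTATTTA TGACCATGTC 180
"""

cds = extract_cds(SEQ55_ROWS, 33, 137)
report = check_cds(cds)
print(report)
```

For the rows shown, the extracted region is 105 nucleotides long, that is 34 codons plus a stop codon, which agrees with the 34-residue protein given as SEQ ID NO:56.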

SEQ ID NO:57 PBJ7 DNA SEQUENCE 

Nucleic Acid Accession #: AA876910 

Coding sequence: 1-2064 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51

| | | | | |

ATGGACAGTT GCCTGCAACA TATGAGAGAC CTACTTTACC TCCTTCAGGA GCTCAGGTGT 60 

TTAAATCCAG CTACACTACT CCCTGATCCA GACTCCACTA CTCCTGTTCA TGACTGTCAG 120 

GATCTGTTGG AAACTACCAA AACTGGCCAA CCTGATCTTC AAGATGTGCC CCTAGAAAAG 180

GCAGATGCCA CTGTGTTCAC AGATGGTAGC AGCTTCCTCG AGCAGGGAGA ACGAAAAGCT 240 

GTTTCTTTTC CACAGCCAGA TCTGCCTGAC AATCCCACAT ACTCAACAGA AGAAGAAAAA 300

CTGGCTTCAG ATGTTGGAGC AAATAAAAAT CAGGAAGGAC GTGTATTCGC AAACACTACT 360 

TGGAGGGCCG GTACCTCCAA GGAAGTCTCC TTTGCAGTTG ATTTATGTGT ACTGTTCCCA 420 

GAGCCAGCTC GTACCCATGA AGAGCAACAT AATTTGCCGG TCATAGGAGC AGGAAGTGTC 480

GACCTTGCAG CAGGATTTGG ACACTCTGGG AGCCAAACTG GATGTGGAAG CTCCAAAGGT 540 

GCAGAAAAAG GGCTCCAAAA TGTTGACTTT TACCTCTGTC CTGGAAATCA CCCTGACGCT 600 

AGCTGTAGAG ATACTTACCA GTTTTTCTGC CCTGATTGGA CATGTGTAAC TTTAGCCACC 660

TACTCTGGGG GATCAACTAG ATCTTCAACT CTTTCCATAA GTCGTGTTCC TCATCCTAAA 720 

TTATGTACTA GAAAAAATTG TAATCCTCTT ACTATAACTG TCCATGACCC TAATGCAGCT 780

CAATGGTATT ATGGCATGTC ATGGGGATTA AGACTTTATA TCCCAGGATT TGATGTTGGG 840

ACTATGTTCA CCATCCAAAA GAAAATCTTG GTCTCATGGA GCTCCCCCAA GCCAATCGGG 900 

CCTTTAACTG ATCTAGGTGA CCCTATATTC CAGAAACACC CTGACAAAGT TGATTTAACT 960 

GTTCCTCTGC CATTCTTAGT TCCTAGACCC CAGCTACAAC AACAACATCT TCAACCCAGC 1020

CTAATGTCTA TACTAGGTGG AGTACACCAT CTCCTTAACC TCACCCAGCC TAAACTAGCC 1080

CAAGATTGTT GGCTATGTTT AAAAGCAAAA CCCCCTTATT ATGTAGGATT AGGAGTAGAA 1140

GCCACACTTA AACGTGGCCC TCTATCTTGT CATACACGAC CCCGTGCTCT CACAATAGGA 1200 

GATGTGTCTG GAAATGCTTC CTGTCTGATT AGTACCGGGT ATAACTTATC TGCTTCTCCT 1260

TTTCAGGCTA CTTGTAATCA GTCCCTGCTT ACTTCCATAA GCACCTCAGT CTCTTACCAA 1320 

GCACCCAACA ATACCTGGTT GGCCTGCACC TCAGGTCTCA CTCGCTGCAT TAATGGAACT 1380

GAACCAGGAC CTCTCCTGTG CGTGTTAGTT CATGTACTTC CCCAGGTATA TGTGTACAGT 1440

GGACCAGAAG GACGACAACT CATCGCTCCC CCTGAGTTAC ATCCCAGGTT GCACCAAGCT 1500 

GTCCCACTTC TGGTTCCCCT ATTGGCTGGT CTTAGCATAG CTGGATCAGC AGCCATTGGT 1560 

ACGGCTGCCC TGGTTCAAGG AGAAACTGGA CTAATATCCC TGTCTCAACA GGTGGATGCT 1620 

GATTTTAGTA ACCTCCAGTC TGCCATAGAT ATACTACATT CCCAGGTAGA GTCTCTGGCT 1680 

GAAGTAGTTC TTCAAAACTG CCGATGCTTA GATCTGCTAT TCCTCTCTCA AGGAGGTTTA 1740

TGTGCAGCTC TAGGAGAAAG TTGTTGCTTC TATGCCAATC AATCTGGAGT CATAAAAGGT 1800 

ACAGTAAAAA AAGTTCGAGA AAATCTAGAT AGGCACCAAC AAGAACGAGA AAATAACATC 1860 

CCCTGGTATC AAAGCATGTT TAACTGGAAC CCATGGCTAA CTACTTTAAT CACTGGGTTA 1920 

GCTGGACCTC TCCTCATCCT ACTATTAAGT TTAATTTTTG GGCCTTGTAT ATTAAATTCG 1980 

TTTCTTAATT TTATAAAACA ACGCATAGCT TCTGTCAAAC TTACGTATCT TAAGACTCAA 2040 
TATGACACCC TTGTTAATAA CTGA 

SEQ ID NO:58 PBJ7 Protein sequence

Protein Accession #: FGENESH predicted

1 11 21 31 41 51

| | | | | |

MDSCLQHMRD LLYLLQELRC LNPATLLPDP DSTTPVHDCQ DLLETTKTGQ PDLQDVPLEK 60 

ADATVFTDGS SFLEQGERKA VSFPQPDLPD NPTYSTEEEK LASDVGANKN QEGRVFANTT 120 

WRAGTSKEVS FAVDLCVLFP EPARTHEEQH NLPVIGAGSV DLAAGFGHSG SQTGCGSSKG 180 

AEKGLQNVDF YLCPGNHPDA SCRDTYQFFC PDWTCVTLAT YSGGSTRSST LSISRVPHPK 240 

LCTRKNCNPL TITVHDPNAA QWYYGMSWGL RLYIPGFDVG TMFTIQKKIL VSWSSPKPIG 300

PLTDLGDPIF QKHPDKVDLT VPLPFLVPRP QLQQQHLQPS LMSILGGVHH LLNLTQPKLA 360 

QDCWLCLKAK PPYYVGLGVE ATLKRGPLSC HTRPRALTIG DVSGNASCLI STGYNLSASP 420 

FQATCNQSLL TSISTSVSYQ APNNTWLACT SGLTRCINGT EPGPLLCVLV HVLPQVYVYS 480 

GPEGRQLIAP PELHPRLHQA VPLLVPLLAG LSIAGSAAIG TAALVQGETG LISLSQQVDA 540 

DFSNLQSAID ILHSQVESLA EVVLQNCRCL DLLFLSQGGL CAALGESCCF YANQSGVIKG 600

TVKKVRENLD RHQQERENNI PWYQSMFNWN PWLTTLITGL AGPLLILLLS LIFGPCILNS 660 
FLNFIKQRIA SVKLTYLKTQ YDTLVNN 

SEQ ID NO:59 PCQ1 DNA SEQUENCE 

Nucleic Acid Accession #: NM_019005

Coding sequence: 182-1885 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51

| | | | | |

TGATGGTGGA AATTTCTTGA AACCGCTCTC GTAATTTGCC ACGTGCTGTT GCAAATATTC 60 

TGGTGAATGA ACACAGAATC AGCATGGCTT TCCTTTGCTG AGAAATCACT GATGGGAAGT 120 

GAGACTTGTT AAACTTGAAA GTGAATGGAC CTGAGTGGAC CCTTTGATCA CATCAGTAAA 180

CATGAGCGGT ACCAAACCTG ATATTTTATG GGCACCACAC CATGTTGATA GATTTGTTGT 240 

GTGTGACTCA GAACTAAGTC TTTATCATGT GGAATCTACT GTGAATTCAG AACTCAAAGC 300 

TGGATCTTTA CGTTTATCTG AAGACTCTGC AGCTACATTA CTGTCAATAA ATTCAGATAC 360 

ACCCTATATG AAATGTGTTG CCTGGTATCT TAATTATGAT CCTGAATGTC TGCTGGCAGT 420 

TGGACAAGCA AATGGTCGAG TTGTACTTAC AAGCCTTGGT CAAGATCATA ACTCAAAGTT 480 

CAAAGATTTG ATAGGAAAAG AGTTTGTTCC AAAACATGCA CGACAATGTA ATACCCTTGC 540 

CTGGAATCCA CTGGATAGTA ACTGGCTAGC TGCTGGTTTA GATAAGCACA GAGCTGACTT 600

TTCAGTGCTA ATATGGGATA TCTGCAGCAA ATATACTCCT GATATAGTTC CCATGGAAAA 660 

AGTGAAACTT TCAGCAGGTG AAACTGAAAC AACATTATTA GTAACAAAAC CACTTTATGA 720

GTTAGGACAG AATGATGCTT GTCTGTCTCT TTGTTGGCTT CCACGAGACC AGAAACTTCT 780 

CCTTGCTGGT ATGCATCGTA ACCTAGCTAT ATTTGATCTT CGGAATACAA GCCAAAAGAT 840 

GTTCGTAAAT ACAAAAGCTG TTCAGGGTGT GACGGTAGAC CCATATTTCC ACGATCGTGT 900 

TGCTTCCTTC TATGAAGGTC AGGTTGCAAT ATGGGATCTT AGAAAATTTG AGAAGCCAGT 960 

TTTGACATTG ACTGAGCAAC CAAAACCCTT AACAAAAGTA GCATGGTGTC CCACTAGGAC 1020 

TGGTCTACTT GCCACTTTAA CAAGGGATAG TAATATTATT AGATTGTATG ATATGCAGCA 1080 

TACACCCACT CCCATTGGGG ATGAAACTGA ACCCACAATA ATTGAAAGAA GTGTGCAACC 1140 

TTGTGACAAT TACATTGCTT CCTTTGCGTG GCATCCAACA AGTCAAAATC GAATGATAGT 1200 

TGTAACTCCC AACCGAACAA TGTCAGACTT CACTGTTTTT GAAAGGATAT CTCTTGCCTG 1260 

GAGCCCAATT ACATCTTTAA TGTGGGCTTG TGGTCGTCAT TTATATGAAT GTACGGAAGA 1320 

AGAAAATGAT AATTCTTTAG AAAAAGATAT AGCAACGAAG ATGCGTCTTC GGGCTTTATC 1380 

AAGGTATGGA CTTGATACAG AGCAGGTGTG GAGGAACCAC ATTTTAGCTG GAAATGAAGA 1440 

TCCACAGCTC AAGTCACTCT GGTATACTCT GCACTTTATG AAGCAATACA CAGAAGATAT 1500 

GGATCAGAAA TCTCCAGGCA ACAAAGGATC ATTGGTTTAT GCAGGAATTA AATCAATTGT 1560 

AAAGTCATCG TTGGGAATGG TGGAAAGCAG CAGACATAAT TGGAGTGGGT TGGATAAGCA 1620 

AAGTGATATT CAAAACTTAA ATGAAGAGAG AATCTTAGCT TTACAGCTTT GTGGGTGGAT 1680 

AAAGAAAGGA ACGGATGTAG ACGTGGGGCC ATTTTTGAAC TCCCTTGTAC AAGAAGGGGA 1740 

ATGGGAAAGA GCTGCTGCTG TGGCATTGTT CAACTTGGAT ATTCGCCGAG CAATCCAAAT 1800

CCTGAATGAA GGGGCATCTT CTGAAAAAGG CAGGAGATCT GAATCTCAAT GTGGTAGCAA 1860 

TGGCTTTATC GGGTTATACG GATGAGAAGA ACTCCCTTTG GAGAGAAATG TGTAGCACAC 1920

TGCGATTACA GCTAAATAAC CCGTATTTGT GTGTCATGTT TGCATTTCTG ACAAGTGAAA 1980 

CAGGATCTTA CGATGGAGTT TTGTATGAAA ACAAAGTTGC AGTACGTGAC AGAGTGGCAT 2040 

TTGCTTGTAA ATTCCTTAGT GATACTCAGA TACATCGAAA AGTTGACCAA TGAAATGAAA 2100 

GAGGCTGGAA ATTTGGAAGG AATTTTGCTT ACAGGCCTTA CTAAAGATGG AGTGGACTTA 2160 

ATGGAGAGTT ATGTTGATAG AACTGGAGAT GTTCAAACAG CAAGTTACTG TATGTTACAG 2220 

GGTTCACCTT TAGATGTTCT TAAAGATGAA AGGGTTCAGT ACTGGATTGA GAATTATAGA 2280 

AATTTATTAG ATGCCTGGAG GTTTTGGCAT AAACGAGCTG AATTTGATAT TCACAGGAGT 2340 

AAGTTGGATC CCAGTTCCAA GCCTTTAGCA CAAGTTTTTG TGAGTTGCAA TTTCTGTGGC 2400 

AAGTCAATCT CCTACAGCTG TTCAGCTGTG CCTCATCAGG GCAGAGGTTT TAGTCAGTAT 2460 

GGTGTGAGTG GCTCACCAAC GAAATCTAAA GTCACAAGTT GTCCTGGCTG TCGAAAACCA 2520 

CTTCCTCGAT GTGCGCTTTG TCTCATTAAT ATGGGAACAC CAGTTTCTAG CTGTCCTGGA 2580
GGAACCAAAT CAGATGAAAA AGTGGACTTG AGCAAGGACA AAAAATTAGC CCAATTTAAC 2640
AACTGGTTTA CATGGTGTCA TAATTGCAGG CACGGTGGAC ATGCTGGACA TATGCTTAGT 2700
TGGTTCAGGG ACCATGCAGA GTGCCCTGTG TCTGCATGCA CGTGTAAATG TATGCAGTTG 2760
GATACAACGG GGAATCTGGT ACCTGCAGAG ACTGTCCAGC CATAAAATGT TACCACCTTA 2820
AGAGAACCCT TCAAGTGTGG AGCTTTCTAG TAGGTGTCCT TCATAGCTCA GAAACATACC 2880 
TCAGAACAAG CCATTCATGA CTTACCTGTA ATGGGAAAAT AAATCATTCT ATCAGAAAAA 2940 
AAAAAAAAAA AAAAAAAAAA 

SEQ ID NO:60 PCQ1 Protein sequence
Protein Accession #: NP_061878

1 11 21 31 41 51

| | | | | |

MSGTKPDILW APHHVDRFVV CDSELSLYHV ESTVNSELKA GSLRLSEDSA ATLLSINSDT 60

PYMKCVAWYL NYDPECLLAV GQANGRVVLT SLGQDHNSKF KDLIGKEFVP KHARQCNTLA 120

WNPLDSNWLA AGLDKHRADF SVLIWDICSK YTPDIVPMEK VKLSAGETET TLLVTKPLYE 180 

LGQNDACLSL CWLPRDQKLL LAGMHRNLAI FDLRNTSQKM FVNTKAVQGV TVDPYFHDRV 240 

ASFYEGQVAI WDLRKFEKPV LTLTEQPKPL TKVAWCPTRT GLLATLTRDS NIIRLYDMQH 300

TPTPIGDETE PTIIERSVQP CDNYIASFAW HPTSQNRMIV VTPNRTMSDF TVFERISLAW 360

SPITSLMWAC GRHLYECTEE ENDNSLEKDI ATKMRLRALS RYGLDTEQVW RNHILAGNED 420

PQLKSLWYTL HFMKQYTEDM DQKSPGNKGS LVYAGIKSIV KSSLGMVESS RHNWSGLDKQ 480 

SDIQNLNEER ILALQLCGWI KKGTDVDVGP FLNSLVQEGE WERAAAVALF NLDIRRAIQI 540 
LNEGASSEKG RRSESQCGSN GFIGLYG 

SEQ ID NO:61 PDG3 DNA SEQUENCE

Nucleic Acid Accession #: U42359

Coding sequence: 563-775 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51

| | | | | |

TTGTACATCT TAACAACCTT AAGCTGTACA AATAGANCAA TAATATCTAA ATGGTGTGAT 60

GATCAGCCCA CAGTACACAT CATTGATGAG AATTTCACTG GTCTCAACCT TTCTCATGCT 120

GAGTCCTGGC TTTGTAAAAT GACTTATAAA GGTCCAAGGA TTTAGAGATG ATTAAGAGAT 180 

AAGCTGGCAT TCTGTAAAGG CACCATCGTC TATCCCCTGT CTTATCTAGA TAAAGAATGT 240

AGTGCTAAAT CTTGTAATAA TATTGTACAA ATGGAAATTC AATCTTAAGG ATTATTTTTT 300 

CCATATTGTT GTATTTCATT GTGGTGTATT GGAAAGTGAT CTGGACTTTG AGTGAGAAGA 360 

TGTGATTTGG ACCATGGCAC TTAAAAACTC TATAACCTCA GGCAAGTCTT TTAATCTTCT 420 

CTGAGCCTCA GTTTTCCTCA TTTTTCAAAT ATAGAGAGTA TAACATTTAT CTCATAAGAC 480 

AAGTTGTAGT AAATTACTGT TTTACAAATG TAAGATAACT TTTAACTGTG AGATTCCATA 540 

TTCCAGTCTT ACATTATTAT GTTTATCTGC CACAGGGAGA AGTCCTCAGA TAAAAATGTC 600

TACCAAAAGA CTGACACGTG GAGTTAATCA TTTGACAGAT GCAAATGCTT CCACCCCCAA 660 

CAAATATACT TTCTTTAACT TCTGTGTGGG TATCACTTAG GGAAAAAAAG GCAGGCAACA 720 

AAATATTTTT TAATTCTATC TTAGGAAAAA TTGTAGNCAA ATCTTTTTNT CCCATTAACA 780

AATAATGTAA GCCTTAATAT TCAAGGGGTA ATAAAAATAC AAAGTCTTCC AAACAGGTAA 840 
CTTACTTGAA AACTTT 

SEQ ID NO:62 PDG3 Protein sequence
Protein Accession #: AAB18375

1 11 21 31 41 51

| | | | | |

MGARGAPSRR RQAGRRLRYL PTGSFPFLLL LLLLCIQLGG GQKKKENLLA EKVEQLMEWS 60 

SRRSIFRMNG DKFRKFIKAP PRNYSMIVMF TALQPQRQCS VCRQANEEYQ ILANSWRYSS 120 

AFCNKLFFSM VDYDEGTDVF QQLNMNSAPT FXHXPPKGRP KRADTFDLQR IGFAAEQLAK 180 

WIADRTDVHI RVFRPPNYSG TIALALLVSL VGGLLYXRRN NLEFIYNKTG WAMVSLCIVF 240

AMTSGQMWNH IRGPPYAHKN PHNGQVSYIH GSSQAQFVAE SHIILVLNAA ITMGMVLLNE 300
AATSKGDVGK RRIICLVGLG LVVFFFSFLL SIFRSKYHGY PYSDLDFE

SEQ ID NO:63 PDG8 DNA SEQUENCE 

Nucleic Acid Accession #: AL080235 

Coding sequence: 245-453 (underlined sequences correspond to start and stop codons) 

1 11 21 31 41 51 

| | | | | |

GGTCGCCGCA CCGGCCGCCT CCGGCCCGCC GCCGCCCCCA GCGCCGCCGC CGCCACCGCC 60 

GGGGCGCCCA CCGCGCTGCC AGCCTACCCC GCGGCCGAGC CGCCCGGGCC GCTGTGGCTG 120 

CAGGGCGAGC CGCTGCATTT CTGCTGCCTA GACTTCAGCC TGGAGGAGCT GCAGGGCGAG 180 

CCGGGCTGGC GGCTGAACCG TAAGCCCATT GAGTCCACGC TGGTGGCCTG CTTCATGACC 240 

CTGGTCATCG TGGTGTGGAG CGTGGCCGCC CTCATCTGGC CGGTGCCCAT CATCGCCGGC 300 

TTCCTGCCCA ACGGCATGGA ACAGCGCCGG ACCACCGCCA GCACCACCGC AGCCACCCCC 360

GCCGCAGTGC CCGCAGGGAC CACCGCAGCC GCCGCCGCCG CCGCCGCTGC CGCCGCCGCC 420

GCGGCCGTCA CTTCGGGGGT GGCGACCAAG TGACCCGCTC CGCTCCTCCC TGTGTCCGTC 480

CTGTGTCCGC GCGCGCGGGT GCCTTTCCCG CCGGGGACTC GGCCGGTGTG CTTCGTGCTG 540 

TAGTTATCGT TAGTTCCTCT TCCCGAGATG GGGCCGCCGA GAGACCCCAG CGCCTTTGAA 600 

AAGCAAGGTT TGTGCTGCGC TTCCAGTTCC GAAAAGCAGA TGTTTAAGCC CTTGGACTGA 660 

GGGTGGGATC GCAGCTCCGA AGACGGAGAG GAGGGAAATG GGGCCCTTTC CCCTCTATTG 720 

CATCCCCCTG CCCGACTCCT TCCCCGCACC CACGTGCCCT AGATTCATGG CAGAAAATGA 780

CCAAATCCTG TGTATTTGTT TTATATATTT AATAACTGTT TTAAATGAAA GTTTTAGTAA 840 

AAAAAATACA AAACAAAAAG ATTAAATTGC TATTGCTGTA GTAAGAGAAG CTCTTTGTAT 900 

CTGAACATAG TTGTATTTGA AATTTGTGGT TTTTTAATTT ATTTAAAATT GGGGGGAGGG 960
CATGGGAAGG ATTTAACACC GATATATTGT TACCGCTGAA AATGAACTTT ATGAACCTTT 1020
TCCAAGTTGA TCTATCCAGT GACGTGGCCT GGTGGGCGTT TCTTCTTGTA CTTATGTGGT 1080 
TTTTTGGCTT TTAATACAGA CATTTTCCTC CAAAAAAAAA AAAAAAAAGG 

SEQ ID NO:64 PDG8 Protein sequence
Protein Accession #: CAB45781

1 11 21 31 41 51

GRRTGRLRPA AAPSAAAATA GAPTALPAYP AAEPPGPLWL QGEPLHFCCL DFSLEELQGE 60
PGWRLNRKPI ESTLVACFMT LVIVVWSVAA LIWPVPIIAG FLPNGMEQRR TTASTTAATP 120
AAVPAGTTAA AAAAAAAAAA AAVTSGVATK

SEQ ID NO:65 PDM1 DNA SEQUENCE 

Nucleic Acid Accession #: NM_006765

Coding sequence: 149-1195 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51

| | | | | |

CGGCCGCGGC CCGGGTCCCT CGCAAAGCCG CTGCCATCCC GGAGGGCCCA GCCAGCGGGC 60

TCCCGGAGGC TGGCCGGGCA GGCGTGGTGC GCGGTAGGAG CTGGGCGCGC ACGGCTACCG 120

CGCGTGGAGG AGACACTGCC CTGCCGCGAT GGGGGCCCGG GGCGCTCCTT CACGCCGTAG 180

GCAAGCGGGG CGGCGGCTGC GGTACCTGCC CACCGGGAGC TTTCCCTTCC TTCTCCTGCT 240 

GCTGCTGCTC TGCATCCAGC TCGGGGGAGG ACAGAAGAAA AAGGAGAATC TTTTAGCTGA 300

AAAAGTAGAG CAGCTGATGG AATGGAGTTC CAGACGCTCA ATCTTCCGAA TGAATGGTGA 360

TAAATTCCGA AAATTTATAA AGGCACCACC TCGAAACTAT TCCATGATTG TTATGTTCAC 420

TGCTCTTCAG CCTCAGCGGC AGTGTTCTGT GTGCAGGCAA GCTAATGAAG AATATCAAAT 480

ACTGGCGAAC TCCTGGCGCT ATTCATCTGC TTTTTGTAAC AAGCTCTTCT TCAGTATGGT 540

GGACTATGAT GAGGGGACAG ACGTTTTTCA GCAGCTCAAC ATGAACTCTG CTCCTACATT 600

CAYGCATTTW CCTCCAAAAG GCAGACCTAA GAGAGCTGAT ACTTTTGACC TCCAAAGAAT 660

TGGATTTGCA GCTGAGCAAC TAGCAAAGTG GATTGCTGAC AGAACGGATG TTCATATTCG 720

GGTTTTCAGA CCACCCAACT ACTCTGGTAC CATTGCTTTG GCCCTGTTAG TGTCGCTTGT 780

TGGAGGTTTG CTTTATTNGA GAAGGAACAA CTTGGAGTTC ATCTATAACA AGACTGGTTG 840

GGCCATGGTG TCTCTGTGTA TAGTCTTTGC TATGACTTCT GGCCAGATGT GGAACCATAT 900

CCGTGGACCT CCATATGCTC ATAAGAACCC ACACAATGGA CAAGTGAGCT ACATTCATGG 960

GAGCAGCCAG GCTCAGTTTG TGGCAGAATC ACACATTATT CTGGTACTGA ATGCCGCTAT 1020

CACCATGGGG ATGGTTCTTC TAAATGAAGC AGCAACTTCG AAAGGCGATG TTGGAAAAAG 1080 

ACGGATAATT TGCCTAGTGG GATTGGGCCT GGTGGTCTTC TTCTTCAGTT TTCTACTTTC 1140 

AATATTTCGT TCCAAGTACC ACGGCTATCC TTATAGTGAT CTGGACTTTG AGTGAGAAGA 1200

TGTGATTTGG ACCATGGCAC TTAAAAACTC TATAACCTCA GCTTTTTAAT TAAATGAAGC 1260

CAAGTGGGAT TTGCATAAAG TGAATGTTTA CCATGAAGAT AAACTGTTCC TGACTTTATA 1320

CTATTTTGAA TTCATTCATT TCATTGTGAT CAGCTAGCTT ATTCTTGTGT ACTTTTTTTA 1380

AACTGTGGGT TTTCCTAGTA AATTTAATTT ACAGAAATCA ATGGTAGCAT TTAGTAATCT 1440

ACAAAGGAAA TATCAAAGTG TTTTTCAAGC CTGTTATATY CAGTGTGTKC CACAGGATTG 1500
CAATAAATGA CAATGTAATT A

SEQ ID NO:66 PDM1 Protein sequence
Protein Accession #: NP_006756

1 11 21 31 41 51

| | | | | |

MGARGAPSRR RQAGRRLRYL PTGSFPFLLL LLLLCIQLGG GQKKKENLLA EKVEQLMEWS 60

SRRSIFRMNG DKFRKFIKAP PRNYSMIVMF TALQPQRQCS VCRQANEEYQ ILANSWRYSS 120

AFCNKLFFSM VDYDEGTDVF QQLNMNSAPT FXHXPPKGRP KRADTFDLQR IGFAAEQLAK 180 

WIADRTDVHI RVFRPPNYSG TIALALLVSL VGGLLYXRRN NLEFIYNKTG WAMVSLCIVF 240

AMTSGQMWNH IRGPPYAHKN PHNGQVSYIH GSSQAQFVAE SHIILVLNAA ITMGMVLLNE 300
AATSKGDVGK RRIICLVGLG LVVFFFSFLL SIFRSKYHGY PYSDLDFE



SEQ ID NO:67 PDM2 DNA SEQUENCE 

Nucleic Acid Accession #: NM_000947 

Coding sequence: 88-1617 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51

| | | | | |

GGTTTCATAT GAACTCTCCC GCCACCCGGG AACAGCTGGC TGCCACCGTT TGTGTTTTCC 60

GAGTTTGTAT TCTTGCAGGT GACCAAGATG GAGTTTTCTG GAAGAAAGCG GAGGAAGCTG 120

AGGTTGGCAG GTGACCAGAG GAATGCTTCC TACCCTCATT GCCTTCAGTT TTACTTGCAG 180 

CCACCTTCTG AAAACATATC TTTAACAGAA TTTGAAAACT TGGCTATTGA TAGAGTTAAA 240 

TTGTTAAAAT CAGTTGAAAA TCTTGGAGTG AGCTATGTGA AAGGAACTGA ACAATACCAG 300

AGTAAGTTGG AGAGTGAGCT TCGGAAGCTC AAGTTTTCCT ACAGAGAGAA GCTAGAAGAT 360

GAATATGAAC CACGAAGAAG AGATCATATT TCTCATTTTA TTTTGCGGCT TGCTTATTGC 420

CAGTCTGAAG AACTTAGACG CTGGTTCATT CAACAAGAAA TGGATCTCCT TCGATTTAGA 480 

TTTAGTATTT TACCCAAGGA TAAAATTCAG GATTTCTTAA AGGATAGCCA ATTGCAGTTT 540 

GAGGCTATAA GTGATGAAGA GAAGACTCTT CGAGAACAGG AGATTGTTGC CTCATCACCA 600

AGTTTAAGTG GACTTAAGTT GGGGTTCGAG TCCATTTATA AGATCCCTTT TGCTGATGCT 660

CTGGATTTGT TTCGAGGAAG GAAAGTCTAT TTGGAAGATG GCTTTGCTTA CGTACCACTT 720
AAGGACATTG TGGCAATCAT CCTGAATGAA TTTAGAGCCA AACTGTCCAA GGCTTTGGCA 780
TTAACAGCCA GGTCCTTGCC TGCTGTGCAG TCTGATGAAA GACTTCAGCC TCTGCTCAAT 840
CACCTCAGTC ATTCCTACAC TGGCCAAGAT TACAGTACCC AGGGAAATGT TGGGAAGATT 900 
TCTTTAGATC AGATTGATTT GCTTTCTACC AAATCCTTCC CACCTTGCAT GCGTCAGTTA 960 

CATAAAGCCT TGCGGGAAAA TCACCATCTT CGTCATGGAG GCCGAATGCA GTATGGCCTA 1020 

TTTCTGAAGG GCATTGGTTT AACTTTGGAA CAGGCATTGC AGTTCTGGAA GCAAGAATTT 1080

ATCAAAGGAA AGATGGATCC AGACAAGTTT GATAAAGGTT ACTCTTACAA CATCCGTCAC 1140 

AGCTTTGGAA AGGAAGGCAA GAGGACAGAC TATACACCTT TCAGTTGCCT GAAGATTATT 1200 

CTGTCCAATC CACCAAGCCA AGGGGATTAT CATGGGTGCC CATTCCGTCA CAGTGATCCA 1260 

GAGCTGCTGA AGCAAAAGTT GCAGTCATAC AAGATCTCTC CTGGAGGGAT AAGCCAGATT 1320 

TTGGATTTAG TAAAGGGGAC ACATTACCAG GTAGCCTGTC AAAAATACTT TGAGATGATA 1380

CACAATGTGG ATGATTGTGG CTTTTCTTTG AATCATCCTA ATCAGTTCTT TTGTGAGAGC 1440 

CAACGTATTC TAAATGGTGG TAAAGACATA AAGAAGGAAC CTATCCAACC AGAAACTCCT 1500 

CAACCCAAAC CAAGTGTCCA GAAAACCAAG GATGCATCAT CTGCTCTGGC CTCTTTAAAT 1560 

TCCTCTCTGG AAATGGATAT GGAAGGACTA GAAGATTACT TTAGTGAAGA TTCTTAGGCA 1620

GTTTTATAAC CCTTTTTCCT CAATAGCCTG TTTCCTGTTT TTAAGATTTT GCCTTTGTTG 1680 

TTGAAAAAGG GTTTCACTGT CACCAAGGCT TAGTGCAGTG ACACAATTAC AGCTGATTGC 1740 

AGCCTTGACC TTCCCAGCTC AAGTGATCCT CCTACCTCAG CCTCCCAAGT AGTTAGGACA 1800 

CACAGGTGTG CACCTCATAT CCAGATAATT TTTTTCAATT TTTTTTTGTA GAGGTGGGGG 1860 

GTCTCCCTAT GTTGCCCAGG CAGATCTCAG ACTCCTGGGC TCAAGCGATC CTCACACCTC 1920

AGCGTCCCAG AGTGCTGGGA TTACAGTTGT GAGCCACTGT GCCTGGCCTT TTTTTTTTTT 1980

TAACCTTTTC GTTTAACTTC TCTCTTCACT GCATCCCAAT CCATCTACAG GCATGCACAC 2040

TTATTAGGAA AGGAGGTTTG AGGTAACAAC AGAGACTTTC ACTATATTTT GCTTTGACAG 2100 

AAGGAAAGAG GAGGAGTTTC TATTAAAATC TGTCACTTGA GTGATGTCAT TTAAGTCCTA 2160 

TTTTAGGAGA TAAAAACAGC TTTGGGGACT GGTTAAAGTC CCCCAGAAAC TACAATAAAG 2220 

AACAACTTTT GTTTTAACTC TTAATCACTT TGTAATTTTG ACTCAATCCT TTTCTGGACC 2280 
ATTTTTGTTA ATAAATATCA AAGTGT 



SEQ ID NO:68 PDM2 Protein sequence: 
Protein Accession #: NP_000938

1 11 21 31 41 51

| | | | | |

MEFSGRKRRK LRLAGDQRNA SYPHCLQFYL QPPSENISLT EFENLAIDRV KLLKSVENLG 60

VSYVKGTEQY QSKLESELRK LKFSYREKLE DEYEPRRRDH ISHFILRLAY CQSEELRRWF 120

IQQEMDLLRF RFSILPKDKI QDFLKDSQLQ FEAISDEEKT LREQEIVASS PSLSGLKLGF 180

ESIYKIPFAD ALDLFRGRKV YLEDGFAYVP LKDIVAIILN EFRAKLSKAL ALTARSLPAV 240 

QSDERLQPLL NHLSHSYTGQ DYSTQGNVGK ISLDQIDLLS TKSFPPCMRQ LHKALRENHH 300 

LRHGGRMQYG LFLKGIGLTL EQALQFWKQE FIKGKMDPDK FDKGYSYNIR HSFGKEGKRT 360 

DYTPFSCLKI ILSNPPSQGD YHGCPFRHSD PELLKQKLQS YKISPGGISQ ILDLVKGTHY 420

QVACQKYFEM IHNVDDCGFS LNHPNQFFCE SQRILNGGKD IKKEPIQPET PQPKPSVQKT 480
KDASSALASL NSSLEMDMEG LEDYFSEDS

SEQ ID NO:69 PDM3 DNA SEQUENCE 

Nucleic Acid Accession #: NM_024840

Coding sequence: 108-491 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51

| | | | | |

AATTCATACA GGAGAGAAGT CATATATATG CAGTGATTGT GGAAAAGGCT TCATCAAGAA 60

GTCTCGGCTC ATTAATCATC AGAGAGTTCA TACAGGAGAG AAACCACATG GATGCAGCCT 120

GTGTGGGAAG GCCTTCTCCA AAAGGTCCAG GCTCACTGAA CACCAGAGAA CTCATACAGG 180 

AGAGAAGCCC TATGAATGCA CTGAATGTGA CAAAGCATTC CGCTGGAAAT CACAGCTCAA 240 

TGCACATCAG AAAGCTCACA CAGGAGAGAA GTCATATATA TGCCGTGATT GTGGAAAAGG 300 

CTTCATTCAG AAGGGAAATC TCATTGTACA TCAGCGAATT CATACTGGAG AAAAACCCTA 360 

TATATGCAAT GAATGTGGAA AAGGCTTCAT CCAAAAGGGC AACCTCCTTA TTCATCGACG 420 

TACTCACACT GGAGAGAAAC CCTATGAATG CAATGAATGT GGGAAAGGCT TCAGCCAGAA 480 

GACATGTTTA ATATCCCATC AGAGATTTCA CACAGGAAAG ACACCCTTTG TATGTACTGA 540

GTGTGGAAAA TCCTGCTCAC ACAAGTCAGG TCTCATTAAC CACCAGAGAA TTCACACAGG 600 

AGAGAAACCC TATACATGCA GTGACTGTGG GAAAGCTTTC AGAGATAAAT CATGTCTCAA 660 

CAGACATCGG AGAACTCATA CAGGGGAGAG ACCGTATGGA TGCTCTGATT GTGGGAAAGC 720 

TTTCTCCCAC TTGTCATGCC TTGTTTATCA TAAGGGAATG CTGCATGCAA GAGAGAAATG 780

TGTAGGTTCA GTCAAATTGG AAAATCCTTG CTCAGAGAGT CATAGCTTAT CACATACACG 840 

TGATCTCATA CAGGATAAAG ACTCTGTTAA CATGGTGACT CTGCAGATGC CTTCTGTGGC 900 

AGCTCAGACC TCATTAACTA ACAGTGCGTT CCAAGCAGAG AGCAAAGTAG CCATTGTGAG 960 

CCAGCCTGTT GCCAGAAGTT CAGTCTCAGC AGATAGTAGA ATTTGCACAG AATAAAAACC 1020 

ATATGAATGC AGTGAATGTG GTAGTGCTTT CAGTGATCAA TTACATCATA TGTCACAAAA 1080 

AACACAGAGG AACAAACTGA TATATTCAAG GTGGAAAGCC CTTGAATAAA ACCTTATGGC 1140 

TAATAAGCAT ATACTCAGAG AAAAATAGTA TGAAGTGGAG ACTGGGAAAT TCTTTTATGG 1200 

GAAGATAGAT CTTCTCATCA GTGACCATAG ATCACATCTT CAGTGAGCTT ATAGTTGGTA 1260 

GAAATATAAT GATCATGGAA AAGTCCTTGT TCAGAAACAG TACGCCAGTA GGTATCAGGG 1320 

GGTTTACACA GGAGAGAAAC TTTTGGAAGA CCTTTGAAGG CTATGAATGT GGCAGGGTTG 1380 

CTAGTGGTAC ATTCTGCCTT ATCCTCAGAG GGAATCATAT AGAAATAAAA CTATGAAAAT 1440

GTAACTAGAA CATCTTCATC AAAATATGAA AGAACACACG AAGCAAATAA GCCCTGTGAA 1500

AAGGAGTATT TTAGAGATTT CGATCAGAAA TCTAACATCA TTATATGGCA GATAATATAC 1560

AGGATGTGTA TTTTAGGACA ATATACCTTG AATCACTAGT TGATATGTCA ATGACTAATT 1620

AAAAGGGGTT GTCAGTGTTA CACATCATTG GTTAAATTTA TAGCACAATG TACCTCTTCC 1680 

CCCTTTTTTG ATAAGAGTCT TCTATTCCCA ACCAAGATCA TTATATGATT AGCTCTTGTG 1740 

TTTCTTTGAT TCCAAATTTC TTCACTTGTT ATTTCAGACT ACTGAAGCTC TTCAAAAGGA 1800
AAAATGTATT TAATTTAATA ATGTAACACA ACAAGTTTGG ATGTGTTTAA CTTTATAAAT 1860
AATCACCCCA GAGGAATGAA GTTCAAAACT TGTGAATAAC C 



SEQ ID NO:70 PDM3 Protein sequence:
Protein Accession #: NP_079116

1 11 21 31 41 51

MDAACVGRPS PKGPGSLNTR ELIQERSPMN ALNVTKHSAG NHSSMHIRKL TQERSHIYAV 60
IVEKASFRRE ISLYISEFIL EKNPIYAMNV EKASSKRATS LFIDVLTLER NPMNAMNVGK 120 
ASARRHV 

SEQ ID NO:71 PDM8 DNA SEQUENCE

Nucleic Acid Accession #: NM_018455

Coding sequence: 341-955 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51

| | | | | |

AATTTCGGCA CGGGGGGGAG GCACAGTGAG TCCACTGGGG CACGGCAGCG TCTAAGCCAC 60

AAGCCGACTG ACATAAGCCA GGTCCTAACG GAGCCTATGT GTAAGTCCAC TACTGGTGCA 120 

AGGTTGCACA CTTCTAAGAA GAGCGGCGTG GGGGGCTCGG CGACCTTCGC TTCAGTCGCT 180 

CCCCCGTGCA GTCCCCTGTG CCCAAGACAC AGCCTGATGC TTGTGCTCCG GTGGGCGGAC 240 

TTGGAGGCGG CGGGAACTGC AATTGGTGGC TTTGAAGGGC GGCGAGCGGG AACAGCTCTT 300 

GAGGAGTGAG ACTGCAGGAG ATGTGGGCCG TGCCAAAGAG ATGGATGAGA CTGTTGCTGA 360

GTTCATCAAG AGGACCATCT TGAAAATCCC CATGAATGAA CTGACAACAA TCCTGAAGGC 420

CTGGGATTTT TTGTCTGAAA ATCAACTGCA GACTGTAAAT TTCCGACAGA GAAAGGAATC 480

TGTAGTTCAG CACTTGATCC ATCTGTGTGA GGAAAAGCGT GCAAGTATCA GTGATGCTGC 540

CCTGTTAGAC ATCATTTATA TGCAATTTCA TCAGCACCAG AAAGTTTGGG ATGTTTTTCA 600

GATGAGTAAA GGACCAGGTG AAGATGTTGA CCTTTTTGAT ATGAAACAAT TTAAAAATTC 660

GTTCAAGAAA ATTCTTCAGA GAGCATTAAA AAATGTGACA GTCAGCTTCA GAGAAACTGA 720

GGAGAATGCA GTCTGGATTC GAATTGCCTG GGGAACACAG TACACAAAGC CAAACCAGTA 780

CAAACCTACC TACGTGGTGT ACTACTCCCA GACTCCGTAC GCCTTCACGT CCTCCTCCAT 840

GCTGAGGCGC AATACACCGC TTCTGGGTCA GGAGTTAGAA GCTACTGGGA AAATCTACCT 900

CCGACAAGAG GAGATCATTT TAGATATTAC CGAAATGAAG AAAGCTTGCA ATTAGTGAAC 960
ATGAAAGGAA AATAAAAATT CCTCACAGTC AAAAAAAAAA AAAAA

SEQ ID NO:72 PDM8 Protein sequence:
Protein Accession #: NP_060925

1 11 21 31 41 51

MDETVAEFIK RTILKIPMNE LTTILKAWDF LSENQLQTVN FRQRKESVVQ HLIHLCEEKR 60

ASISDAALLD IIYMQFHQHQ KVWDVFQMSK GPGEDVDLFD MKQFKNSFKK ILQRALKNVT 120

VSFRETEENA VWIRIAWGTQ YTKPNQYKPT YVVYYSQTPY AFTSSSMLRR NTPLLGQELE 180

ATGKIYLRQE EIILDITEMK KACN

SEQ ID NO:73 PDM9 DNA SEQUENCE 

Nucleic Acid Accession #: NM_016192
Coding sequence: 1-1125 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51

| | | | | |

ATGGTGCTGT GGGAGTCCCC GCGGCAGTGC AGCAGCTGGA CACTTTGCGA GGGCTTTTGC 60

TGGCTGCTGC TGCTGCCCGT CATGCTACTC ATCGTAGCCC GCCCGGTGAA GCTCGCTGCT 120

TTCCCTACCT CCTTAAGTGA CTGCCAAACG CCCACCGGCT GGAATTGCTC TGGTTATGAT 180 

GACAGAGAAA ATGATCTCTT CCTCTGTGAC ACCAACACCT GTAAATTTGA TGGGGAATGT 240 

TTAAGAATTG GAGACACTGT GACTTGCGTC TGTCAGTTCA AGTGCAACAA TGACTATGTG 300 

CCTGTGTGTG GCTCCAATGG GGAGAGCTAC CAGAATGAGT GTTACCTGCG ACAGGCTGCA 360 

TGCAAACAGC AGAGTGAGAT ACTTGTGGTG TCAGAAGGAT CATGTGCCAC AGATGCAGGA 420

TCAGGATCTG GAGATGGAGT CCATGAAGGC TCTGGAGAAA CTAGTCAAAA GGAGACATCC 480 

ACCTGTGATA TTTGCCAGTT TGGTGCAGAA TGTGACGAAG ATGCCGAGGA TGTCTGGTGT 540 

GTGTGTAATA TTGACTGTTC TCAAACCAAC TTCAATCCCC TCTGCGCTTC TGATGGGAAA 600 

TCTTATGATA ATGCATGCCA AATCAAAGAA GCATCGTGTC AGAAACAGGA GAAAATTGAA 660 

GTCATGTCTT TGGGTCGATG TCAAGATAAC ACAACTACAA CTACTAAGTC TGAAGATGGG 720

CATTATGCAA GAACAGATTA TGCAGAGAAT GCTAACAAAT TAGAAGAAAG TGCCAGAGAA 780

CACCACATAC CTTGTCCGGA ACATTACAAT GGCTTCTGCA TGCATGGGAA GTGTGAGCAT 840

TCTATCAATA TGCAGGAGCC ATCTTGCAGG TGTGATGCTG GTTATACTGG ACAACACTGT 900 

GAAAAAAAGG ACTACAGTGT TCTATACGTT GTTCCCGGTC CTGTACGATT TCAGTATGTC 960 

TTAATCGCAG CTGTGATTGG AACAATTCAG ATTGCTGTCA TCTGTGTGGT GGTCCTCTGC 1020

ATCACAAGGA AATGCCCCAG AAGCAACAGA ATTCACAGAC AGAAGCAAAA TACAGGGCAC 1080 
TACAGTTCAG ACAATACAAC AAGAGCGTCC ACGAGGTTAA TCTGA

SEQ ID NO:74 PDM9 Protein sequence:
Protein Accession #: NP_057276

1 11 21 31 41 51
| | | | | |
1 MVLWESPRQC SSWTLCEGFC WLLLLPVMLL IVARPVKLAA FPTSLSDCQT PTGWNCSGYD
61 DRENDLFLCD TNTCKFDGEC LRIGDTVTCV CQFKCNNDYV PVCGSNGESY QNECYLRQAA
121 CKQQSEILVV SEGSCATDAG SGSGDGVHEG SGETSQKETS TCDICQFGAE CDEDAEDVWC
181 VCNIDCSQTN FNPLCASDGK SYDNACQIKE ASCQKQEKIE VMSLGRCQDN TTTTTKSEDG
241 HYARTDYAEN ANKLEESARE HHIPCPEHYN GFCMHGKCEH SINMQEPSCR CDAGYTGQHC
301 EKKDYSVLYV VPGPVRFQYV LIAAVIGTIQ IAVICVVVLC ITRKCPRSNR IHRQKQNTGH
361 YSSDNTTRAS TRLI



SEQ ID NO:75 PD01 DNA SEQUENCE

Nucleic Acid Accession #: NM_014324

Coding sequence: 89-1237 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51
| | | | | |
GGCGCCGGGA TTGGGAGGGC TTCTTGCAGG CTGCTGGGCT GGGGCTAAGG GCTGCTCAGT 60
TTCCTTCAGC GGGGCACTGG GAAGCGCCAT GGCACTGCAG GGCATCTCGG TCGTGGAGCT 120
GTCCGGCCTG GCCCCGGGCC GTNTCTGTGC TATGGTCCTG GCTGACTTCG GGGCGCGTGT 180
GGTACGCGTG GACCGGCCCG GCTCCCGCTA CGACGTGAGC CGCTTGGGCC GGGGCAAGCG 240
CTCGCTAGTG CTGGACCTGA AGCAGCCGCG GGAGCCGCGT GCTGCGGCGT CTGTGCAAGC 300
GGTCGGATGT GCTGCTGGAG CCCTTCCGCC GCGGTGTCAT GGAGAAACTC CAGCTGGGCC 360
CAGAGATTCT GCAGCGGGAA AATCCAAGGC TTATTTATGC CAGGCTGAGT GGATTTGGCC 420
AGTTCAGGAA AGCTTCTGCC GGTTAGCTGG CCACGATATC AACTATTTGG CTTTGTCAGG 480
TGTTCTCTCA AAAATTGGCA GAAGTGGTGA GAATCCGTAT GCCCCGCTGA ATCTCGTGGC 540
TGACTTTGCT GGTGGTGGCC TTATGTGTGC ACTGGGCATT ATAATGGCTC TTTTTGACCG 600
CACACGCACT GACAAGGGTC AGGTCATTGA TGCAAATATG GTGGAAGGAA CAGCATATTT 660
AAGTTCTTTT CTGTGGAAAA CTCAGAAATC GAGTCTGTGG GAAGCACCTC GAGGACAGAA 720
CATGTTGGAT GGTGGAGCAC CTTTCTATAC GACTTACAGG ACAGCAGATG GGGAATTCAT 780
GGCTGTTGGA GCAATAGAAC CCCAGTTCTA CGAGCTGCTG ATCAAAGGAC TTGGACTAAA 840
GTCTGATGAA CTTCCCAATC AGATGAGCAC GGATGATTGG CCAGAAATGA AGAAGAAGTT 900
TGCAGATGTA TTTGCAAAGA AGACGAAGGC AGAGTGGTGT CAAATCTTTG ACGGCACAGA 960
TGCCTGTGTG ACTCCGGTTC TGACTTTTGA GGAGGTTGTT CATCATGATC ACAACAAGGA 1020
ACGGGGCTCG TTTATCACCA GTGAGGAGCA GGACGTGAGC CCCCGCCTTG CACCTCTGCT 1080
GTTAAACACC CCAGCCATCC CTTCTTCCAA AGGGGATCCT TTCATAGGAG AACACACTGA 1140
GGAGATACTT GAAGAATTTG GATTCAGCCG AGAAGAGATT TATCAGCTTA ACTCAGATAA 1200
AATCATTGAA AGTAATAAGG TAAAAGCTAG TCTCTAACTT CCAGGCCCAC GGCTCAAGTG 1260
AATTTGAATA CTGCATTTAC AGTGTAGAGT AACACATAAC ATTGTATGCA TGGAAACATG 1320
GAGGAACAGT ATTACAGTGT CCTACCACTC TAATCAAGAA AAGAATTACA GACTCTGATT 1380
CTACAGTGAT GATTGAATTC TAAAAATGGT TATCATTAGG GCTTTTGATT TATAAAACTT 1440
TGGGTACTTA TACTAAATTA TGGTAGTTAT TCTGCCTTCC AGTTTGCTTG ATATATTTGT 1500
TGATATTAAG ATTCTTGACT TATATTTTGA ATGGGTTCTA GTGAAAAAGG AATGATATAT 1560
TCTTGAAGAC ATCGATATAC ATTTATTTAC ACTCTTGATT CTACAATGTA GAAAATGAGG 1620
AAATGCCACA AATTGTATGG TGATAAAAGT CACGTGAAAC AGAGTGATTG GTTGCATCCA 1680
GGCCTTTTGT CTTGGTGTTC ATGATCTCCC TCTAAGCACA TTCCAAACTT TAGCAACAGT 1740
TATCACACTT TGTAATTTGC AAAGAAAAGT TTCACCTGTA TTGAATCAGA ATGCCTTCAA 1800
CTGAAAAAAA CATATCCAAA ATAATGAGGA AATGTGTTGG CTCACTACGT AGAGTCCAGA 1860
GGGACAGTCA GTTTTAGGGT TGCCTGTATC CAGTAACTCG GGGCCTGTTT CCCCGTGGGT 1920
CTCTGGGCTG TCAGCTTTCC TTTCTCCATG TGTTTGATTT CTCCTCAGGC TGGTAGCAAG 1980
TTCTGGATCT TATACCCAAC ACACAGCAAC ATCCAGAAAT AAAGATCTCA GGACCCCCCA 2040
AAAAAAAAAA AAAAAAAAAA AAAAAAAA



SEQ ID NO:76 PDQ1 Protein sequence:
Protein Accession #: NP_055139

1 11 21 31 41 51
| | | | | |
1 MALQGISVVE LSGLAPGRXC AMVLADFGAR VVRVDRPGSR YDVSRLGRGK RSLVLDLKQP 60
61 REPRAAASVQ AVGCAAGALP PRCHGETPAG PRDSAAGKSK AYLCQAEWIW PVQESFCRLA 120
121 GHDINYLALS GVLSKIGRSG ENPYAPLNLV ADFAGGGLMC ALGIIMALFD RTRTDKGQVI 180
181 DANMVEGTAY LSSFLWKTQK SSLWEAPRGQ NMLDGGAPFY TTYRTADGEF MAVGAIEPQF 240
241 YELLIKGLGL KSDELPNQMS TDDWPEMKKK FADVFAKKTK AEWCQIFDGT DACVTPVLTF 300
301 EEVVHHDHNK ERGSFITSEE QDVSPRLAPL LLNTPAIPSS KGDPFIGEHT EEILEEFGFS 360
361 REEIYQLNSD KIIESNKVKA SL



SEQ ID NO:77 PD03 DNA SEQUENCE

Nucleic Acid Accession #: AB028951

Coding sequence: 97-1128 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51
| | | | | |
GTTAAATCCT TACTTTACCA GATTCTTGAT GGTATCCATT ACCTCCATGC AAATTGGGTG 60
CTTCACAGAG ACTTGAAACC AGCAAATATC CTAGTAATGG GAGAAGGTCC TGAGAGGGGG 120
AGAGTCAAAA TAGCTGACAT GGGTTTTGCC AGATTATTCA ATTCTCCTCT AAAGCCACTA 180
GCAGATTTGG ATCCAGTAGT TGTGACATTT TGGTATCGGG CTCCAGAACT TTTGCTTGGT 240
GCAAGGCATT ATACAAAGGC CATTGATATA TGGGCAATAG GTTGTATATT TGCTGAATTG 300
TTGACTTCGG AACCTATTTT TCACTGTCGT CAGGAAGATA TAAAAACAAG CAATCCCTTT 360
CATCATGATC AACTGGATCG GATATTTAGT GTCATGGGGT TTCCTGCAGA TAAAGACTGG 420
GAAGATATTA GAAAGATGCC AGAATATCCC ACACTTCAAA AAGACTTTAG AAGAACAACG 480 
TATGCCAACA GTAGCCTCAT AAAGTACATG GAGAAACACA AGGTCAAGCC TGACAGCAAA 540 
GTGTTCCTCT TGCTTCAGAA ACTCCTGACC ATGGATCCAA CCAAGAGAAT TACCTCGGAG 600 
CAAGCTCTGC AGGATCCCTA TTTTCAGGAG GACCCTTTGC CAACATTAGA TGTATTTGCC 660

GGCTGCCAGA TTCCATACCC CAAACGAGAA TTCCTTAATG AAGATGATCC TGAAGAAAAA 720 
GGTGACAAGA ATCAGCAACA GCAGCAGAAC CAGCATCAGC AGCCCACAGC CCCTCCACAG 780 
CAGGCAGCAG CCCCTCCACA GGCGCCCCCA CCACAGCAGA ACAGCACCCA GACCAACGGG 840 
ACCGCAGGTG GGGCTGGGGC CGGGGTCGGG GGCACCGGAG CAGGGTTGCA GCACAGCCAG 900 
GACTCCAGCC TGAACCAGGT GCCTCCAAAC AAGAAGCCAC GGCTAGGGCC TTCAGGCGCA 960

AACTCAGGTG GACCTGTGAT GCCCTCGGAT TATCAGCACT CCAGTTCTCG CCTGAATTAC 1020 
CAAAGCAGCG TTCAGGGATC CTCTCAGTCC CAGAGCACAC TTGGCTACTC TTCCTCGTCT 1080 
CAGCAGAGCT CACAGTACCA CCCATCTCAC CAGGCCCACC GGTACTGACC AGCTCCCGTT 1140 
GGGCCAGGCC AGCCCAGCCC AGAGCACAGG CTCCAGCAAT ATGTCTGCAT TGAAAAGAAC 1200 
CAAAAAAATG CAAACTATGA TGCCATTTAA AACTCATACA CATGGGAGGA AAACCTTATA 1260

TACTGAGCAT TGTGCAGGAC TGATAGCTCT TCTTTATTGA CTTAAAGAAG ATTCTTGTGA 1320 
AGTTTCCCCA GCACCCCTTC CCTGCATGTG TTCCATTGTG ACTTCTCTGA TAAAGCGTCT 1380 
GATCTAATCC CAGCACTTCT GTAACCTTCA GCATTTCTTT GAAGGATTTC CTGGTGCACC 1440
TTTCTCATGC TGTAGCAATC ACTATGGTTT ATCTTTTCAA AGCTCTTTTA ATAGGATTTT 1500
AATGTTTTAG AAACAGGATT CCAGTGGTGT ATAGTTTTAT ACTTCATGAA CTGATTTAGC 1560
AACACAGGTA AAAATGCACC TTTTAAAGCA CTACGTTTTC ACAGACAATA ACTGTTCTGC 1620 
TCATGGAAGT CTTAAACAGA AACTGTTACT GTCCCAAAGT ACTTTACTAT TACGTTCGTA 1680 
TTTATCTAGT TTCAGGGAAG GTCTAATAAA AAGACAAGCG GTGGGACAGA GGGAACCTAC 1740 
AACCAAAAAC TGCCTAGATC TTTGCAGTTA TGTGCTTTAT GCCACGAAGA ACTGAAGTAT 1800 
GTGGTAATTT TTATAGAATC ATTCATATGG AACTGAGTTC CCAGCATCAT CTTATTCTGA 1860

ATAGCATTCA GTAATTAAGA ATTACAATTT TAACCTTCAT GTAGCTAAGT CTACCTTAAA 1920 
AAGGGTTTCA AGAGCTTTGT ACAGTCTCGA TGGCCCACAC CAAAACGCTG AAGAGAGTAA 1980 
CAACTGCACT AGGATTTCTG TAAGGAGTAA TTTTGATCAA AAGACGTGTT ACTTCCCTTT 2040 
GAAGGAAAAG TTTTTAGTGT GTATTGTACA TAAAGTCGGC TTCTCTAAAG AACCATTGGT 2100 
TTCTTCACAT CTGGGTCTGC GTGAGTAACT TTCTTGCATA ATCAAGGTTA CTCAAGTAGA 2160
AGCCTGAAAA TTAATCTGCT TTTAAAATAA AGAGCAGTGT TCTCCATTCG TATTTGTATT 2220 
AGATATAGAG TGACTATTTT TAAAGCATGT TAAAAATTTA GGTTTTATTC ATGTTTAAAG 2280 
TATGTATTAT GTATGCATAA TTTTGCTGTT GTTACTGAAA CTTAATTCTA TCAAGAATCT 2340
TTTTCATTGC ACTGAATGAT TTCTTTTGCC CCTAGGAGAA AACTTAATAA TTGTGCCTAA 2400
AAACTATGGG CGGATAGTAT AAGACTATAC TAGACAAAGT GAATATTTGC ATTTCCATTA 2460
TCTATGAATT AGTGGCTGAG TTCTTTCTTA GCTGCTTTAA GGAGCCCCTC ACTCCCCAGA 2520 
GTCAAAAGGA AATGTAAAAA CTTAGAGCTC CCATTGTAAT GTAAGGGGCA AGAAATTTGT 2580 
GTTCTTCTGA ATGCTACTAG CAGCACCAGC CTTGTTTTAA ATGTTTTCTT GAGCTAGAAG 2640 
AAATAGCTGA TTATTGTATA TGCAAATTAC ATGCATTTTT AAAAACTATT CTTTCTGAAC 2700 
TTATCTACCT GGTTATGATA CTGTGGGTCC ATACACAAGT AAAATAAGAT TAGACAGAAG 2760

CCAGTATACA TTTTGCACTA TTGATGTGAT ACTGTAGCCA GCCAGGACCT TACTGATCTC 2820 
AGCATAATAA TGCTCACTAA TAATGAAGTC TGCATAGTGA CACTCATCAA GACTGAAGAT 2880 
GAAGCAGGTT ACGTGCTCCA TTGGAAGGAG TTTCTGATAG TCTCCTGCTG TTTTACCCCT 2940 
TCCATTTTTT AAAATAAGAA ATTAGCAGCC CTCTGCATAA TGTAGCTGCC TATATGCAGT 3000 
TTTATCCTGT GCCCTAAAGC CTCACTGTCC AGAGCTGTTG GTCATCAGAT GCTTATTGCA 3060

CCCTCACCAT GTGCCTGGTG CCCTGCTGGG TAGAGAACAC AGAGGACAGG GCATACTTCT 3120 
TGTCCTTAAG GAGCTTGTGA TCTGTGACAG TAAGCCCTCC TGGGATGTCT GTGCCATGTG 3180 
ATTGACTTAC AAGTGAAACT GTCTTATAAT ATGAAGGTCT TTTTGTTTAC TTCTAAACCC 3240 
ACTTGGGTAG TTACTATCCC CAAATCTGTT CTGTAAATAA TATTATGGAA GGGTTTCTAT 3300 
GTCAGTCTAC CTTAGAGAAA GCCAGTGATT CAATATCACA AAAGGCATTG ACGTATCTTT 3360
GAAATGTTCA CAGCAGCCTT TTAACAACAA CTGGGTGGTC CTTGTAGGCA GAACATACTC 3420 
TCCTAAGTGG TTGTAGGAAA TTGCAAGGAA AATAGAAGGT CTGTTCTTGC TCTCAAGGAG 3480 
GTTACCTTTA ATAAAAGAAG ACAAACCCAG ATAGATATGT AAACCAAAAT ACTATGCCCC 3540 
TTAATACTTT ATAAGCAGCA TTGTTAAATA GTTCTTACGC TTATACATTC ACAGAACTAC 3600 
CCTGTTTTCC TTGTATATAA TGACTTTTGC TGGCAGAACT GAAATATAAA CTGTAAGGGG 3660
ATTTCGTCAG TTGCTCCCAG TATACAATAT CCTCCAGGAC ATAGCCAGAA ATCTCCATTC 3720 
CACACATGAC TGAGTTCCTA TCCCTGCACT GGTACTGGCT CTTTTCTCCT CTTTCCTTGC 3780 
CTCAGGGTTC GTGCTACCCA CTGATTCCCT TTACCCTTAG TAATAATTTT GGATCATTTT 3840 
CTTTCCTTTA AAGGGGAACA AAGCCTTTTT TTTTTTTGAG ACGGAGTGTT GCTCTGTCAC 3900 
CCAAGCTGGA GTGCAGTGGC ACGATCTTGG CTCACTCCAA CCTCCACCTT CCAGGTTCAA 3960
GTGATTCTCC TGCCTCAGCC TCCCGAGTAG CTGGGACTAC GGGCACGCAC CACCACGTCT 4020
GGCTAATTTT TGTATTTTTA GTAGAGATGG GGTTTCACCC TATTGGTCAG GCTGGTCTTG 4080
AATTCCTCAC CTCAGGTCAT CCGCCTGTCT CGGCCTCCCG AAGTGCTGGG ATTATAGGTG 4140 
TGAGCCACCG CACCCAGTTG GGAACAAAGC CTTTTTAACA CACGTAAGGG CCCTCAAACC 4200 
GTGGGACCTC TAAGGAGACC TTTGAAGCTT TTTGAGGGCA AACTTTACCT TTGTGGTCCC 4260

CAAATGATGG CATTTCTCTT TGAAATTTAT TAGATACTGT TATGTCCCCC AAGGGTACAG 4320 
GAGGGGCATC CCTCAGCCTA TGGGAACACC CAAACTAGGA GGGGTTATTG ACAGGAAGGA 4380 
ATGAATCCAA GTGAAGGCTT TCTGCTCTTC GTGTTACAAA CCAGTTTCAG AGTTAGCTTT 4440 
CTGGGGAGGT GTGTGTTTGT GAAAGGAATT CAAGTGTTGC AGGACAGATG AGCTCAAGGT 4500 
AAGGTAGCTT TGGCAGCAGG GCTGATACTA TGAGGCTGAA ACAATCCTTG TGATGAAGTA 4560

GATCATGCAG TGACATACAA AGACCAAGGA TTATGTATAT TTTTATATCT CTGTGGTTTT 4620
GAAACTTTAG TACTTAGAAT TTTGGCCTTC TGCACTACTC TTTTGCTCTT ACGAACATAA 4680 
TGGACTCTTA AGAATGGAAA GGGATGACAT TTACCTATGT GTGCTGCCTC ATTCCTGGTG 4740 
AAGCAACTGC TACTTGTTCT CTATGCCTCT AAAATGATGC TGTTTTCTCT GCTAAAGGTA 4800 
AAAGAAAAGA AAAAAATAGT TGGAAAATAA GACATGCAAC TTGATGTGCT TTTGAGTAAA 4860

TTTATGCAGC AGAAACTATA CAATGAAGGA AGAATTCTAT GGAAATTACA AATCCAAAAC 4920
TCTATGATGA TGTCTTCCTA GGGAGTAGAG AAAGGCAGTG AAATGGCAGT TAGACCAACA 4980
GAGGCTTGAA GGATTCAAGT ACAAGTAATA TTTTGTATAA AACATAGCAG TTTAGGTCCC 5040
CATAATCCTC AAAAATAGTC ACAAATATAA CAAAGTTCAT TGTTTTAGGG TTTTTAAAAA 5100
ACGTGTTGTA CCTAAGGCCA TACTTACTCT TCTATGCTAT CACTGCAAAG GGGTGATATG 5160






TATGTATTAT ATAAAAAAAA AAACCCTTAA TGCACTGTTA TCTCCTAAAT ATTTAGTAAA 5220
TTAATACTAT TTAATTTTTT TAAAGATTTG TCTGTGTAGA CACTAAAAGT ATTACACAAA 5280
ATCTGGACTG AAGGTGTCCT TTTTAACAAC AATTTAAAGT ACTTTTTATA TATGTTATGT 5340
AGTATATCCT TTCTAAACTG CCTAGTTTGT ATATTCCTAT AATTCCTATT TGTGAAGTGT 5400
ACCTGTTCTT GTCTCTTTTT TCAGTCATTT TCTGCACGCA TCCCCCTTTA TATGGTTATA 5460
GAGATGACTG TAGCTTTTCG TGCTCCACTG CGAGGTTTGT GCTCAGAGCC GCTGCACCCC 5520
AGCGAGGCCT GCTCCATGGA GTGCAGGACG AGCTACTGCT TTGGAGCGAG GGTTTCCTGC 5580
TTTTGAGTTG ACCTGACTTC CTTCTTGAAA TGACTGTTAA AACTAAAATA AATTACATTG 5640
CATTTATTTT ATATTCTTGG TTGAAATAAA ATTTAATTGA CTTTG



SEQ ID NO:78 PD03 Protein sequence:
Protein Accession #: BAA82980 



1 11 21 31 41 51
| | | | | |
VKSLLYQILD GIHYLHANWV LHRDLKPANI LVMGEGPERG RVKIADMGFA RLFNSPLKPL 60
ADLDPWVTF WYRAPELLLG ARHYTKAIDI WAIGCIFAEL LTSEPIFHCR QEDIKTSNPF 120
HHDQLDRIFS VMGFPADKDW EDIRKMPEYP TLQKDFRRTT YANSSLIKYM EKHKVKPDSK 180
VFLLLQKLLT MDPTKRITSE QALQDPYFQE DPLPTLDVFA GCQIPYPKRE FLNEDDPEEK 240
GDKNQQQQQN QHQQPTAPPQ QAAAPPQAPP PQQNSTQTNG TAGGAGAGVG GTGAGLQHSQ 300
DSSLNQVPPN KKPRLGPSGA NSGGPVMPSD YQHSSSRLNY QSSVQGSSQS QSTLGYSSSS 360
QQSSQYHPSH QAHRY



SEQ ID NO:79 PD05 DNA SEQUENCE

Nucleic Acid Accession #: XM_002922 

Coding sequence: 1-2190 (underlined sequences correspond to start and stop codons) 



1 11 21 31 41 51
| | | | | |
ATGAATCCTT TCCAGAAAAA TGAGTCCAAG GAAACTCTTT TTTCACCTGT CTCCATTGAA 60
GAGGTACCAC CTCGACCACC TAGCCCTCCA AAGAAGCCAT CTCCGACAAT CTGTGGCTCC 120
AACTATCCAC TGAGCATTGC CTTCATTGTG GTGAATGAAT TCTGCGAGCG CTTTTCCTAT 180
TATGGAATGA AAGCTGTGCT GATCCTGTAT TTCCTGTATT TCCTGCACTG GAATGAAGAT 240
ACCTCCACAT CTATATACCA TGCCTTCAGC AGCCTCTGTT ATTTTACTCC CATCCTGGGA 300
GCAGCCATTG CTGACTCGTG GTTGGGAAAA TTCAAGACAA TCATCTATCT CTCCTTGGTG 360
TATGTGCTTG GCCATGTGAT CAAGTCCTTG GGTGCCTTAC CAATACTGGG AGGACAAGTG 420
GTACACACAG TCCTATCATT GATCGGCCTG AGTCTAATAG CTTTGGGGAC AGGAGGCATC 480
AAACCCTGTG TGGCAGCTTT TGGTGGAGAC CAGTTTGAAG AAAAACATGC AGAGGAACGG 540
ACTAGATACT TCTCAGTCTT CTACCTGTCC ATCAATGCAG GGAGCTTGAT TTCTACATTT 600
ATCACACCCA TGCTGAGAGG AGATGTGCAA TGTTTTGGAG AAGACTGCTA TGCATTGGCT 660
TTTGGAGTTC CAGGACTGCT CATGGTAATT GCACTTGTTG TGTTTGCAAT GGGAAGCAAA 720
ATATACAATA AACCACCCCC TGAAGGAAAC ATAGTGGCTC AAGTTTTCAA ATGTATCTGG 780
TTTGCTATTT CCAATCGTTT CAAGAACCGT TCTGGAGACA TTCCAAAGCG ACAGCACTGG 840
CTAGACTGGG CAGCTGAGAA ATATCCAAAG CAGCTCATTA TGGATGTAAA GGCACTGACC 900
AGGGTACTAT TCCTTTATAT CCCATTGCCC ATGTTCTGGG CTCTTTTGGA TCAGCAGGGT 960
TCACGATGGA CTTTGCAAGC CATCAGGATG AATAGGAATT TGGGGTTTTT TGTGCTTCAG 1020
CCGGACCAGA TGCAGGTTCT AAATCCCTTT CTGGTTCTTA TCTTCATCCC GTTGTTTGAC 1080
TTTGTCATTT ATCGTCTGGT CTCCAAGTGT GGAATTAACT TCTCATCACT TAGGAAAATG 1140
GCTGTTGGTA TGATCCTAGC GTGCCTGGCA TTTGCAGTTG CGGCAGCTGT AGAGATAAAA 1200
ATAAATGAAA TGGCCCCAGC CCAGTCAGGT CCCCAGGAGG TTTTCCTACA AGTCTTGAAT 1260
CTGGCAGATG ATGAGGTGAA GGTGACAGTG GTGGGAAATG AAAACAATTC TCTGTTGATA 1320
GAGTCCATCA AATCCTTTCA GAAAACACCA CACTATTCCA AACTGCACCT GAAAACAAAA 1380
AGCCAGGATT TTCACTTCCA CCTGAAATAT CACAATTTGT CTCTCTACAC TGAGCATTCT 1440
GTGCAGGAGA AGAACTGGTA CAGTCTTGTC ATTCGTGAAG ATGGGAACAG TATCTCCAGC 1500
ATGATGGTAA AGGATACAGA AAGCAAAACA ACCAATGGGA TGACAACCGT GAGGTTTGTT 1560
AACACTTTGC ATAAAGATGT CAACATCTCC CTGAGTACAG ATACCTCTCT CAATGTTGGT 1620
GAAGACTATG GTGTGTCTGC TTATAGAACT GTGCAAAGAG GAGAATACCC TGCAGTGCAC 1680
TGTAGAACAG AAGATAAGAA CTTTTCTCTG AATTTGGGTC TTCTAGACTT TGGTGCAGCA 1740
TATCTGTTTG TTATTACTAA TAACACCAAT CAGGGTCTTC AGGCCTGGAA GATTGAAGAC 1800
ATTCCAGCCA ACAAAATGTC CATTGCGTGG CAGCTACCAC AATATGCCCT GGTTACAGCT 1860
GGGGAGGTCA TGTTCTCTGT CACAGGTCTT GAGTTTTCTT ATTCTCAGGC TCCCTCTAGC 1920
ATGAAATCTG TGCTCCAGGC AGCTTGGCTA TTGACAATTG CAGTTGGGAA TATCATCGTG 1980
CTTGTTGTGG CACAGTTCAG TGGCCTGGTA CAGTGGGCCG AATTCATTTT GTTTTCCTGC 2040
CTCCTGCTGG TGATCTGCCT GATCTTCTCC ATCATGGGCT ACTACTATGT TCCTGTAAAG 2100
ACAGAGGATA TGCGGGGTCC AGCAGATAAG CACATTCCTC ACATCCAGGG GAACATGATC 2160
AAACTAGAGA CCAAGAAGAC AAAACTCTGA



SEQ ID NO:80 PD05 Protein sequence: 
Protein Accession #: XP_002922 



1 11 21 31 41 51
| | | | | |
MNPFQKNESK ETLFSPVSIE EVPPRPPSPP KKPSPTICGS NYPLSIAFIV VNEFCERFSY 60
YGMKAVLILY FLYFLHWNED TSTSIYHAFS SLCYFTPILG AAIADSWLGK FKTHYLSLV 120
YVLGHVIKSL GALPILGGQV VHTVLSLIGL SLIALGTGGI KPCVAAFGGD QFEEKHAEER 180
TRYFSVFYLS INAGSLXSTF ITPMLRGDVQ CFGEDCYALA FGVPGLLMVI ALWFAMGSK 240
IYNKPPPEGN IVAQVFKCIW FAISNRFKNR SGDIPKRQHW LDWAAEKYPK QLIMDVKALT 300
RVLFLYIPLP MFWALLDQQG SRWTLQAIRM NRNLGFFVLQ PDQMQVLNPF LVLIFIPLFD 360
FVIYRLVSKC GINFSSLRKM AVGMILACLA FAVAAAVEIK INEMAPAQSG PQEVFLQVLN 420
LADDEVKVTV VGNENNSLLI ESIKSFQKTP HYSKLHLKTK SQDFHFHLKY HNLSLYTEHS 480



VQEKNWYSLV IREDGNSISS MMVKDTESKT TNGMTTVRFV NTLHKDVNIS LSTDTSLNVG 540 
EDYGVSAYRT VQRGEYPAVH CRTEDKNFSL NLGLLDFGAA YLFVITNNTN QGLQAWKIED 600 
IPANKHSIAW QLPQYALVTA GEVMFSVTGL EFSYSQAPSS MKSVLQAAWL LTIAVGNIIV 660
LWAQFSGLV QWAEFILFSC LLLVICLIFS IMGYYWFVK TEDMRGPADK HIPHIQGNMI 720 
KLETKKTKL 

SEQ ID NO:81 PD06 DNA SEQUENCE

Nucleic Acid Accession #: NM_020448 

Coding sequence: 1-1221 (underlined sequences correspond to start and stop codons) 



1 11 21 31 41 51

| | | | | |

ATGGACGGAT CCCACAGCGC AGCCCTGAAG CTGCAGCAGC TGCCTCCCAC AAGTAGCTCC 60 

AGCGCCGTAA GCGAGGCCTC CTTCTCCTAC AAGGAAAACC TGATTGGCGC CCTCTTGGCG 120 

ATCTTCGGGC ACCTCGTGGT CAGCATTGCA CTTAACCTCC AGAAGTACTG CCACATCCGC 180

CTGGCAGGCT CCAAGGATCC CCGGGCCTAT TTCAAGACCA AGACATGGTG GCTGGGCCTG 240 

TTCCTGATGC TTCTGGGCGA GCTGGGTGTG TTCGCCTCCT ACGCCTTCGC GCCGCTGTCA 300 

CTCATCGTGC CCCTCAGCGC AGTTTCTGTG ATAGCTAGTG CCATCATAGG AATCATATTC 360 

ATCAAGGAAA AGTGGAAACC GAAAGACTTT CTGAGGCGCT ACGTCTTGTC CTTTGTTGGC 420

TGCGGTTTGG CTGTCGTGGG TACCTACCTG CTGGTGACAT TCGCACCCAA CAGTCACGAG 480

AAGATGACAG GCGAGAATGT CACCAGGCAC CTCGTGAGCT GGCCTTTCCT TTTGTACATG 540 

CTGGTGGAGA TCATTCTGTT CTGCTTGCTG CTCTACTTCT ACAAGGAGAA GAACGCCAAC 600 

AACATTGTCG TGATTCTTCT CTTGGTGGCG TTACTTGGCT CCATGACAGT GGTGACAGTC 660 

AAGGCCGTGG CTGGGATGCT TGTCTTGTCC ATTCAAGGGA ACCTGCAGCT TGACTACCCC 720

ATCTTCTACG TGATGTTCGT GTGCATGGTG GCAACCGCCG TCTATCAGGC TGCGTTTTTG 780

AGTCAAGCCT CACAGATGTA CGACTCCTCT TTGATTGCCA GTGTGGGCTA CATTCTGTCC 840

ACAACCATTG CTATCACAGC AGGTGCAATA TTTTACCTGG ACTTCATCGG GGAGGACGTG 900

CTGCACATCT GCATGTTTGC ACTGGGGTGC CTCATTGCAT TCTTGGGCGT CTTCTTAATC 960

ACGCGTAACA GGAAGAAGCC CATTCCATTT GAGCCCTATA TTTCCATGGA TGCCATGCCA 1020

GGTATGCAGA ACATGCACGA TAAAGGGATG ACTGTCCAGC CTGAACTTAA AGCTTCTTTT 1080

TCCTATGGGG CTCTGGAAAA CAATGACAAC ATTTCTGAGA TCTACGCTCC TGCCACCCTG 1140 

CCAGTCATGC AAGAAGAGCA CGGCTCCAGA AGTGCCTCTG GGGTCCCCTA CCGAGTCCTA 1200 
GAGCACACCA AGAAGGAATG A

SEQ ID NO:82 PD06 Protein sequence
Protein Accession #: NP_065181

1 11 21 31 41 51

| | | | | |

MDGSHSAALK LQQLPPTSSS SAVSEASFSY KENLIGALLA IFGHLWSIA LNLQKYCHIR 60

LAGSKDPRAY FKTKTWWLGL FLMLLGELGV FASYAFAPLS LIVFLSAVSV IASAIIGIIF 120

IKEKWKPKDF LRRYVLSFVG CGLAWGTYL LVTFAPNSHE KMTGENVTRH LVSWPFLLYM 180 

LVEIILFCLL LYFYKEKNAN NIWTLLLVA LLGSMTWTV KAVAGMLVLS IQGNLQLDYP 240 

IFYVMFVCMV ATAVYQAAFL SQASQMYDSS LIASVGYILS TTIAITAGAI FYLDFIGEDV 300

LHICMFALGC LIAFLGVFLI TRNRKKPIPF EPYISMDAMP GMQNMHDKGM TVQPELKASF 360
SYGALENNDN ISEIYAPATL PVMQEEHGSR SASGVPYRVL EHTKKE 

SEQ ID NO:83 PD08 DNA SEQUENCE 

Nucleic Acid Accession #: NM_032712
Coding sequence: 555-908 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51 

I I 1 I I I 

CACTCATTAA GAACAGAGGA GGCTGCCTGT TACTCCTGGT GTTGCATCCC TCCAGACACT 60 

CTGCTGTTTC CTGCCTAGGC GTGGCTGCAG CCATGGCTAG GAAAGCGCTG CCACCCACCC 120 

ACCTGGGCCA GAGCTGGTTC TGCTCCTGCT GCAGGGACAC TGAGCTGGCT ATCTCGGCGC 180 

TTCGGGCAAQ AACTGCAACA GGCTCTCCTG GGTCCTGCAG GTGTACAGCC GGGCCCCTGC 240 

CTTGTGCCTC AGCTCTCGAG AGCTGCTGCT GCCGGGTGAC CTGATCCAAC CTGATAAGGT 300 

GCCATCTTCA GCTACCACTG CAAGGCCCTG AGGGCAACAG CAGCACGGCA CTGCCCACCC 360 

GGCTGCTGAT GGCCTGGTGC CAGCTGGGAG TCCTCCCGGC ACTTCGAGGC CACTGAGCCA 420 

CCCTTCCAGC CCCAGCCCAC CATGGACAGG GGTATCCAGC TTCCTCCTCA ACCTCGTCCT 480 

CTGCCCCTGA GCCAGTGACG CCCAAGGACA TGCCTGTTAC CCAGGTCCTG TACCAGCACT 540 

AGCTGGTCAA GGGCATGACA GTGCTGGAGG CCGTCTTGGA GATCCAGGCC ATCACTGGCA 600 

GCAGGCTGCT CTCCATGGTG CCAGGGCCCG CCAGGCCACC AGGCTCATGC TGGGACCCAA 660 

CCCAGTGCAC AAGGACTTGG CTGCTGAGCC ACACACCCAG GAGAAGGTGG ATAAGTGGGC 720 

TACCAAGGGC TTCCTGCAGG CTAGGGGAGG AGCCACCCCC GCTTCCCTAT TGTGACCAGG 780 

CCTATGGGGA GGAGCTGTCC ATACGCCACC GTGAGACCTG GGCCTGGCTC TCAAGGACAG 840 

ACACCGCCTG GCCTGGTGCT CCAGGGGTGA AGCAGGCCAG AATCCTGGGG GAGCTGCTCC 900 

TGGTTTGAGC TGCATTCAGG AAGTGCGGGA CATGGTAGGG GAGGCAAAAA GCCTTGGGCA 960

CTACCCTCCC TGTGGAGCTG TTCGGTGTCC GTCGAGCTAG CCACACCCTG ACACCATGTT 1020 

CAAGGGTACC GGAAGAGAAG GGTGTCTGCC CCCAACCTCC CCTGTGGGTG TCACTGGCCA 1080 

GATGTCATGA GGGAAGCAGG CCTTGTGAGT GGACACTGAC CATGAGTCCC TGGGGGGAGT 1140 

GATCCCCCAG GCATCGTGTG CCATGTTGCA CTTCTGCCCA GGCAGCAGGG TGGGTGGGTA 1200 

CCATGGGTGC CCACCCCTCC ACCACATGGG GCCCCAAAGC ACTGCAGGCC AAGCAGGGCA 1260 
ACCCCACACC CTTGACATAA AAGCATCTTG AAGCTTTTAA AAAAAAAAAA AAAAAA 



SEQ ID NO:84 PD08 Protein sequence 
Protein Accession #: NP_116101



1 11 21 31 41 51

| | | | | |



MTVLEAVLEI QAITGSRLLS MVPGPARPPG SCWDPTQCTR TWLLSHTPRR RWISGLPRAS 60 
CRLGEEPPPL PYCDQAYGEE LSIRHRETWA WLSRTDTAWP GAFGVKQARI LGELLLV

SEQ ID NO:85 PDT1 DNA SEQUENCE

Nucleic Acid Accession #: NM_000693

Coding sequence: 53-1591 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51 

I I I 1 I I 

AGCCGGTGCG CCGCAGACTA GGGCGCCTCG GGCCAGGGAG CGCGGAGGAG CCATGGCCAC 60 

CGCTAACGGG GCCGTGGAAA ACGGGCAGCC GGACGGGAAG CCGCCGGCCC TGCCGCGCCC 120 

CATCCGCAAC CTGGAGGTCA AGTTCACCAA GATATTTATC AACAATGAAT GGCACGAATC 180 

CAAGAGTGGG AAAAAGTTTG CTACATGTAA CCCTTCAACT CGGGAGCAAA TATGTGAAGT 240 

GGAAGAAGGA GATAAGCCCG ACGTGGACAA GGCTGTGGAG GCTGCACAGG TTGCCTTCCA 300 

GAGGGGCTCG CCATGGCGCC GGCTGGATGC CCTGAGTCGT GGGCGGCTGC TGCACCAGCT 360 

GGCTGACCTG GTGGAGAGGG ACCGCGCCAC CTTGGCCGCC CTGGAGACGA TGGATACAGG 420 

GAAGCCATTT CTTCATGCTT TTTTCATCGA CCTGGAGGGC TGTATTAGAA CCCTCAGATA 480 

CTTTGCAGGG TGGGCAGACA AAATCCAGGG CAAGACCATC CCCACAGATG ACAACGTCGT 540 

ATGCTTCACC AGGCATGAGC CCATTGGTGT CTGTGGGGCC ATCACTCCAT GGAACTTCCC 600 

CCTGCTGATG CTGGTGTGGA AGCTGGCACC CGCCCTCTGC TGTGGGAACA CCATGGTCCT 660 

GAAGCCTGCG GAGCAGACAC CTCTCACCGC CCTTTATCTC GGCTCTCTGA TCAAAGAGGC 720 

CGGGTTCCCT CCAGGAGTGG TGAACATTGT GCCAGGATTC GGGCCCACAG TGGGAGCAGC 780 

AATTTCTTCT CACCCTCAGA TCAACAAGAT CGCCTTCACC GGCTCCACAG AGGTTGGAAA 840

ACTGGTTAAA GAAGCTGCGT CCCGGAGCAA TCTGAAGCGG GTGACGCTGG AGCTGGGGGG 900 

GAAGAACCCC TGCATCGTGT GTGCGGAOGC TGACTTGGAC TTGGCAGTGG AGTGTGCCCA 960 

TCAGGGAGTG TTCTTCAACC AAGGCCAGTG TTGCACGGCA GCCTCCAGGG TGTTCGTGGA 1020 

GGAGCAGGTC TACTCTGAGT TTGTCAGGCG GAGCGTGGAG TATGCCAAGA PACGGCCCGT 1080 

GGGAGACCCC TTCGATGTCA AAACAGAACA GGGGCCTCAG ATTGATCAAA AGCAGTTCGA 1140 

CAAAATCTTA GAGCTGATCG AGAGTGGGAA GAAGGAAGGG GCCAAGCTGG AATGCGGGGG 1200 

CTCAGCCATG GAAGACAAGG GGCTCTTCAT CAAACCCACT GTCTTCTCAG AAGTCACAGA 1260 

CAACATGCGG ATTGCCAAAG AGGAGATTTT CGGGCCAGTG CAACCAATAC TGAAGTTCAA 1320 

AAGTATCGAA GAAGTGATAA AAAGAGCGAA TAGCACCGAC TATGGACTCA CAGCAGCCGT 1380 

GTTCACAAAA AATCTCGACA AAGCCCTGAA GTTGGCTTCT GCCTTAGAGT CTGGAACGGT 1440 

CTGGATCAAC TGCTACAACG CCCTCTATGC ACAGGCTCCA TTTGGTGGCT TTAAAATGTC 1500 

AGGAAATGGC AGAGAACTAG GTGAATACGC TTTGGCCGAA TACACAGAAG TGAAAACTGT 1560 

CACCATCAAA CTTGGCGACA AGAACCCCTG AAGGAAAGGC GGGGCTCCTT CCTCAAACAT 1620

CGGACGGCGG AATGTGGCAG ATGAAATGTG CTGGAGGAAA AAAATGACAT TTCTGACCTT 1680 

CCCGGGACAC ATTCTTCTGG AGGCTTTACA TCTACTGGAG TTGAATGATT GCTGTTTTCC 1740 

TCTCACTCTC CTGTTTATTC ACCAGACTGG GGATGCCTAT AGGTTGTCTG TGAAATCGCA 1800 

GTCCTGCCTG GGGAGGGAGC TGTTGGCCAT TTCTGTGTTT CCCTTTAAAC CAGATCCTGG 1860 

AGACAGTGAG ATACTCAGGG CGTTGTTAAC AGGGAGTGGT ATTTGAAGTG TCCAGCAGTT 1920 

GCTTGAAATG CTTTGCCGAA TCTGACTCCA GTAAGAATGT GGGAAAACCC CCTGTGTGTT 1980 

CTGCAAGCAG GGCTCTTGCA CCAGCGGTCT CCTCAGGGTG GACCTGCTTA CAGAGCAAGC 2040 

CACGCCTCTT TCCGAGGTGA AGGTGGGACC ATTCCTTGGG AAAGGATTCA CAGTAAGGTT 2100 

TTTTGGTTTT TGTTTTTTGT TTTCTTGTTT TTAAAAAAAG GATTTCACAG TGAGAAAGTT 2160 

TTGGTTAGTG CATACCGTGG AAGGGCGCCA GGGTCTTTGT GGATTGCATG TTGACATTGA 2220

CCGTGAGATT CGGCTTCAAA CCAATACTGC CTTTGGAATA TGACAGAATC AATAGCCCAG 2280 

AGAGCTTAGT CAAAGACGAT ATCACGGTCT ACCTTAACCA AGGCACTTTC TTAAGCAGAA 2340 

AATATTGTTG AGGTTACCTT TGCTGCTAAA GATCCAATCT TCTAACGCCA CAACAGCATA 2400

GCAAATCCTA GGATAATTCA CCTCCTCATT TGACAAATCA GAGCTGTAAT TCACTTTAAC 2460 

AAATTACGCA TTTCTATCAC GTTCACTAAC AGCTTATGAT AAGTCTGTGT AGTCTTCCTT 2520 

TTCTCCAGTT CTGTTACCCA ATTTAGATTA GTAAAGCGTA CACAACTGGA AAGACTGCTG 2580 

TAATAACACA GCCTTGTTAT TTTTAAGTCC TATTTTGATA TTAATTTCTG ATTAGTTAGT 2640 

AAATAACACC TGGATTCTAT GGAGGACCTC GGTCTTCATC CAAGTGGCCT GAGTATTTCA 2700 

CTGGCAGGTT GTGAATTTTT CTTTTCCTCT TTGGGAATCC AAATGATGAT GTGCAATTTC 2760 

ATGTTTTAAC TTGGGAAACT GAAAGTGTTC CCATATAGCT TCAAAAACAA AAACAAATGT 2820

GTTATCCGAC GGATACTTTT ATGGTTACTA ACTAGTACTT TCCTAATTGG GAAAGTAGTG 2880 

CTTAAGTTTG CAAATTAAGT TGGGGAGGGC AATAATAAAA TGAGGGCCCG TAACAGAACC 2940 

AGTGTGTGTA TAACGAAAAC CATGTATAAA ATGGGCCTAT CACCCTTGTC AGAGATATAA 3000 

ATTACCACAT TTGGCTTCCC TTCATCAGCT AACACTTATC ACTTATACTA CCAATAACTT 3060

GTTAAATCAG GATTTGGCTT CATACACTGA ATTTTCAGTA TTTTATCTCA AGTAGATATA 3120 

GACACTAACC TTGATAGTGA TACGTTAGAG GGTTCCTATT CTTCCATTGT ACGATAATGT 3180 

CTTTAATATG AAATGCTACA TTATTTATAA TTGGTAGAGT TATTGTATCT TTTTATAGTT 3240 

GTAAGTACAC AGAGGTGGTA TATTTAAACT TCTGTAATAT ACTGTATTTA GAAATGGAAA 3300 

TATATATAGT GTTAGGTTTC ACTTCTTTTA AGGTTTACCC CTGTGGTGTG GTTTAAAAAT 3360 

CTATAGGCCT GGGAATTCCG ATCCTAGCTG CAGATCGCAT CCCACAATGC GAGAATGATA 3420 
AAATAAAATT GGATATTTGA GA 

SEQ ID NO:86 PDT1 PROTEIN SEQUENCE

Protein Accession #: NP_000684 

1 11 21 31 41 51 

I I i I I I 

MATANGAVEN GQPDGKPPAL PRPIRNLEVK FTKIFINNEW HESKSGKKFA TCNPSTREQI 60 

CEVEEGDKPD VDKAVEAAQV AFQRGSPWRR LDALSRGRLL HQLADLVERD RATLAALETM 120 

DTGKPFLHAF FIDLEGCIRT LRYFAGWADK IQGKTIPTDD NWCFTRHEP IGVCGAITPW 180 

NFPLLMLVWK LAPALCCGNT MVLKPAEQTP LTALYLGSLI KEAGFPPGW NIVPGFGPTV 240 

GAAISSHPQI NKIAFTGSTE VGKLVKEAAS RSKLKRVTLE LGGKNPCIVC ADADLDLAVE 300
CAHQGVFFNQ GQCCTAASRV FVEEQVYSEF VRRSVEYAKK RPVGDPFDVK TEQGPQIDQK 360
QFDKILELIE SGKKEGAKLE CGGSAMEDKG LFIKPTVFSE VTDNMRIAKE EIFGPVQPIL 420 
KFKSIEEVIK RANSTDYGLT AAVFTKNLDK ALKLASALES GTVWINCYNA LYAQAPFGGF 480 
KMSGNGRELG EYALAEYTEV KTVTIKLGDK NP 

SEQ ID NO:87 PDV3 DNA SEQUENCE 

Nucleic Acid Accession #: NM_032642

Coding sequence: 184-1263 (underlined sequences correspond to start and stop codons)



1 11 21 31 41 51

| | | | | |

GACCATTAGC AGGCACCCAG GCCTGTCTTT GGCTCGGAAA CGGTGGCCCC CAATGTAGCC 60 

TAGTTTGAAC CTAGGAACTG CAGGACCAGA GAGATTCCAC TGGAGCCTGA TGGACGGGTG 120 

ACAGAGGGAA CCCTACTCTG GAAACTGTCA GTCCCAGGGC ACTGGGGAGG GCTGAGGCCG 180 

ACCATGCCCA GCCTGCTGCT GCTGTTCACG GCTGCTCTGC TGTCCAGCTG GGCTCAGCTT 240 

CTGACAGACG CCAACTCCTG GTGGTCATTA GCTTTGAACC CGGTGCAGAG ACCCGAGATG 300 

TTTATCATCG GTGCCCAGCC CGTGTGCAGT CAGCTTCCCG GGCTCTCCCC TGGCCAGAGG 360 

AAGCTGTGCC AATTGTACCA GGAGCACATG GCCTACATAG GGGAGGGAGC CAAGACTGGC 420 

ATCAAGGAAT GCCAGCACCA GTTCCGGCAG CGGCGGTGGA ATTGCAGCAC AGCGGACAAC 480 

GCATCTGTCT TTGGGAGAGT CATGCAGATA GGCAGCCGAG AGACCGCCTT CACCCACGCG 540 

GTGAGCGCCG CGGGCGTGGT CAACGCCATC AGCCGGGCCT GCCGCGAGGG CGAGCTCTCC 600 

ACCTGCGGCT GCAGCCGGAC GGCGCGGCCC AAGGACCTGC CCCGGGACTG GCTGTGGGGC 660 

GGCTGTGGGG ACAACGTGGA GTACGGCTAC CGCTTCGCCA AGGAGTTTGT GGATGCCCGG 720 

GAGCGAGAGA AGAACTTTGC CAAAGGATCA GAGGAGCAGG GCCGGGTGCT CATGAACCTG 780

CAAAACAACG AGGCCGGTCG CAGGGCTGTG TATAAGATGG CAGACGTAGC CTGCAAATGC 840

CACGGCGTCT CGGGGTCCTG CAGCCTCAAG ACCTGCTGGC TGCAGCTGGC CGAGTTCCGC 900 

AAGGTCGGGG ACCGGCTGAA GGAGAAGTAC GACAGCGCGG CCGCCATGCG CGTCACCCGC 960 

AAGGGCCGGC TGGAGCTGGT CAACAGCCGC TTCACCCAGC CCACCCCGGA GGACCTGGTC 1020 

TATGTGGACC CCAGCCCCGA CTACTGCCTG CGCAACGAGA GCACGGGCTC CCTGGGCACG 1080 

CAGGGCCGCC TCTGCAACAA GACCTCGGAG GGCATGGATG GCTGTGAGCT CATGTGCTGC 1140

GGGCGTGGCT ACAACCAGTT CAAGAGCGTG CAGGTGGAGC GCTGCCACTG CAAGTTCCAC 1200 

TGGTGCTGCT TCGTCAGGTG TAAGAAGTGC ACGGAGATCG TGGACCAGTA CATCTGTAAA 1260

TAGCCCGGAG GGCCTGCTCC CGGCCCCCCC TGCACTCTGC CTCACAAAGG TCTATATTAT 1320 

ATAAATCTAT ATAAATCTAT TTTATATTTG TATAAGTAAA TGGGTGGGTG CTATACAATG 1380 

GAAAGATGAA AATGGAAAGG AAGAGCTTAT TTAAGAGACG CTGGAGATCT CTGAGGAGTG 1440 

GACTTTGCTG GTTCTCTCCT CTTGGTGGGT GGGAGACAGG GCTTTTTCTC TCCCTCTGGC 1500 

GAGGACTCTC AGGATGTAGG GACTTGGAAA TATTTACTGT CTGTCCACCA CGGCCTGGAG 1560 

GAGGGAGGTT GTGGTTGGAT GGAGGAGATG ATCTTGTCTG GAAGTCTAGA GTCTTTGTTG 1620 

GTTAGAGGAC TGCCTGTGAT CCTGGCCACT AGGCCAAGAG GCCCTATGAA GGTGGCGGGA 1680

ACTCAGCTTC AACCTCGATG TCTTCAGGGT CTTGTCCAGA ATGTAGATGG GTTCCGTAAG 1740 

AGGCCTGGTG CTCTCTTACT CTTTCATCCA CGTGCACTTG TGCGGCATCT GCAGTTTACA 1800 

GGAACGGCTC CTTCCCTAAA ATGAGAAGTC CAAGGTCATC TCTGGCCCAG TGACCACAGA 1860

GAGATCTGCA CCTCCCGGAC TTCAGGCCTG CCTTTCCAGC GAGAATTCTT CATCCTCCAC 1920 

GGTTCACTAG CTCCTACCTG AAGAGGAAAG GGGGCCATTT GACCTGACAT GTCAGGAAAG 1980 

CCCTAAACTG AATGTTTGCG CCTGGGCTGC AGAAGCCAGG GTGCATGACC AGGCTGCGTG 2040 

GACGTTATAC TGTCTTCCCC CACCCCCGGG GAGGGGAAGC TTGAGCTGCT GCTGTCACTC 2100 

CTCCACCGAG GGAGGCCTCA CAAACCACAG GACGCTGCAA CGGGTCAGGC TGGCGGGCCC 2160 

GGCGTGCTCA TCATCTCTGC CCCAGGTGTA CGGTTTCTCT CTGACATTAA ATGCCCTTCA 2220
TGGAAAAAAA AAAAAGAAAA AAAAAAAAAA AA 



SEQ ID NO:88 PDV3 Protein sequence 
Protein Accession #: NP_116031



1 11 21 31 41 51 

| | | | | |

MPSLLLLFTA ALLSSWAQLL TDANSWWSLA LNPVQRPEMF IIGAQPVCSQ LPGLSPGQRK 60 

LCQLYQEHMA YIGEGAKTGI KECQHQFRQR RWNCSTADNA SVFGRVMQIG SRETAFTHAV 120 

SAAGWNAIS RACREGELST CGCSRTARPK DLPRDWLWGG CGDNVEYGYR FAKEFVDARE 180

REKNFAKGSE EQGRVLMNLQ NNEAGRRAVY KMADVACKCH GVSGSCSLKT CWLQLAEFRK 240

VGDRLKEKYD SAAAMRVTRK GRLELVNSRF TQPTPEDLVY VDPSPDYCLR NESTGSLGTQ 300
GRLCNKTSEG MDGCELMCCG RGYNQFKSVQ VERCHCKFHW CCFVRCKKCT EIVDQYICK 

SEQ ID NO:89 PDT9 DNA SEQUENCE 

Nucleic Acid Accession #: NM_033280 
Coding sequence: 58-636 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51 

I I I i I i 

GGCAGCCGTC TGTGCCACCC AGAGCCGGCG GGCCGCTAGG TCCCCGGAGA CCCTGCTATG 60

GTGCGTGCGG GCGCCGTGGG GGCTCATCTC CCCGCGTCCG GCTTGGATAT CTTCGGGGAC 120 

CTGAAGAAGA TGAACAAGCG CCAGCTCTAT TACCAGGTTT TAAACTTCGC CATGATCGTG 180 

TCTTCTGCAC TCATGATATG GAAAGGCTTG ATCGTGCTCA CAGGCAGTGA GAGCCCCATC 240 

GTGGTGGTGC TGAGTGGCAG TATGGAGCCG GCCTTTCACA GAGGAGACCT CCTGTTCCTC 300 

ACAAATTTCC GGGAAGACCC AATCAGAGCT GGTGAAATAG TTGTTTTTAA AGTTGAAGGA 360 

CGAGACATTC CAATAGTTCA CAGAGTAATC AAAGTTCATG AAAAAGATAA TGGAGACATC 420 

AAATTTCTGA CTAAAGGAGA TAATAATGAA GTTGATGATA GAGGCTTGTA CAAAGAAGGC 480 

CAGAACTGGC TGGAAAAGAA GGACGTGCTG GGAAGAGCAA GAGGGTTTTT ACCATATGTT 540 

GGTATGGTCA CCATAATAAT GAATGACTAT CCAAAATTCA AGTATGCTCT TTTGGCTGTA 600 

ATGGGTGCAT ATGTGTTACT AAAACGTGAA TCCTAAAATG AGAAGCAGTT CCTGGGACCA 660 

GATTGAAATG AATTCTGTTG AAAAAGAGAA AAACTAATAT ATTTGAGATG TTCCATTTTC 720 
TGTATAAAAG GGAACAGTGT GGAGATGTTT TTGTCTTGTC CAAATAAAAG ATTCACCAGT 780
AAAAAAAAAA AAAA 

SEQ ID NO:90 PDT9 Protein sequence 
Protein Accession #: NP_150596 

1 11 21 31 41 51 

| | | | | |

MVRAGAVGAH LPASGLDIFG DLKKMNKRQL YYQVLNFAMI VSSALKIWKG LIVLTGSESP 60 
IWVLSGSME PAFHRGDLLF LTNFREDFIR AGEIWFKVE GRDIPIVHRV IKVHEKDNGD 120 
IKFLTKGDNN EVDDRGLYKE GQNWLEKKDV VGRARGFLPY VGMVTIIMND YFKFKYALLA 180 
VMGAYVLLKR ES 



SEQ ID NO:91 PDV5 DNA SEQUENCE

Nucleic Acid Accession #: NM_016590

Coding sequence: 691-975 (underlined sequences correspond to start and stop codons) 



1 11 21 31 41 51 

| | | | | |

GATTACTCAC ACAGTCTTGA AGATGCAATG TCAGCTATTT AGGACAGAAA CATCCAAGGC 60 

CGTGTCAGAA CTCAATTACG ACTACATATG CATTAAGGCA GGAACTGGCA GGCCTCAGGG 120 

TACGCCAACT ATAGGACTCG TGCTTCTCGT ACGCTGGGCT ATAATCTATG AAACTGAGCT 180 

CCAGAGCCAG CCAATCACTT AGCTCCTCAT AACAAGTCTA ACTGGCTCTG GAAAGCTGAA 240

AGGGCTGCAC TGGAACAACA CAGATGAGAT ATTCTACACA TTAATCTACT TATCTGGAAT 300

CACTTTGCCT CTAAAGGCCA GAGAAAAATC ACAGCTTCCT TGTCGGAGGG GAAAAGGACA 360

GGTGATCTGG GGAAAACGCA GCTACACCTG GAGCAAGGTC TCTTCCCGGC TTGGCAATCT 420

CAGCTGTGCC GGCGCTACGG GACCCGAGCC GTCCCAGAAA CCAAAGGGCA GGCACGGCAG 480

CAAACGCCTG AGTGCTGCTG CCTTCGGTGA CTATATGAGA ATGGAAACTT CTAAGGAAGC 540 

CAGGTTGTTA GAATTGTTAC CCCCTTTACT CAGAGATAAC ATAGATTATC CAGGCTGAGA 600 

TGGAAAACAA GCCCTTTATT GAATTTTCAA CACAGACTCC CTGCTTCTCA TCTCCTTAAT 660 

AAAATTTCAT TAAAATCCCC TTGAACTCCC ATGTTCAAAT CTCCATTTGT TGACAGACAA 720 

AGCCAACAAT ACTCTAAACT GAGGCCTGCA AGTCATTTCA TTTGTATTTT TGTCCAGAAA 780 

TTTCCCATAG GAAGACTTCA CCTCCTACAA CTCCGAAGAA AACCCTTACT GTCCAAGACC 840

GTCACCAGCA ACCATCCGCA GTCATTCAAG TGGAAGCTTT CACAGCTTTT GTACATTCTC 900

TGTGTCAATA TACAACTGAG TTACAGACTG TCCCCTGGCT CCCTGACCCT TACAAACACT 960

AAAAGTTTTG TTTGACTCAA CTTCAAGCTG CTCATCTGTT AGTAAGTGAT GTTCACTCCA 1020

GAACACATTC ATGATGAGAA CTTTCTAAAA GACCAGCACT GCTCTTCCCC TCCTATAATC 1080

ATAATAATCA TGATAACCTG AAACATGTTA CTGGGACTCG ACATTTTTCT GGGGATTGAA 1140 

ATCTTTAGTC CTTGGAGCTG TCACATAGCA GGGGCAACCT CACACTGAAA CAAAGGAAGT 1200

GATGTCCCAT TATTATCCAC CCTGAGCCAC CATAATATGC TGTTTACATT TATTTTCTTC 1260 

AGCCTGTGCA AAACAAAGCA ATGGAAAAGG AAACTAAAAA ATATACATAC TAGTACCATT 1320 

ATCTTCTTTT GCCTAAAATT ACTAATGCAC CACGTCAGTC TGCTTCCTTC AGGCATCATT 1380 

CTCAATTCAT CAGGACTTGT ATTAGCAGGT TCTGGCTAGA GAGACTATCT CCTGTCATCA 1440 

CGATCAATTA ATGTTTTCTG GTGATCACAT CAGGCCCTAT CTAAGAAGCT CATGGTATAC 1500 

AAGGGTCACC CAAATAGCTG AGTGCAGTCC TTGCTCATAT TTCCTTCATC TTAACCCCGC 1560 

AAACAAGAAT TAAGATGATC CCAATAAAAG AAAAATTGCT CAGGAAACTG AACCTTTTTC 1620

TGAACCAAGC ACTGTCAGCA AATCTCAGGT ATTAGAGCAA CTATGGTTGA TTGAAAAGTG 1680 

TCTCAAAATC TGGGCCAAGA ATGATTGCTA GGTCCATAAG CTAATTTGTC TGGCCTTGCC 1740 

ATTTACGTAA GCCAAAGAAA GTCACTCATG AGTAAACTAT AGAAAACGTT CAGACCCATC 1800 

CTGTTAGTAT GTCAAATCAA CTAAGACTGG CAGGGTATTA ACTCCATTCC AGGTGACATG 1860

GATAAAGAGC CCCATTATTT TCACAGTGCC AGCCTCTACC TAAGGAAACC CTAGACCTTG 1920 

GAACCAGTTT CCTGGTAGGG AACTGCTGAC AGTTTCAATG CTGACAGTTG GAGCCAATGC 1980 

CTCATAGTGT AAACTGAAAG AAAAATAGTT GCTTTTTAAA ATGTCAGCAA GAAGGCCTGC 2040 

CTCATCTTAA CAAAGCAAAA AAAAATGCTT TAATTCAAAT TAAAAATCAT GATACTAAAA 2100
AAAAAAAA 



SEQ ID NO:92 PDV5 Protein sequence 
Protein Accession #: NP_057674 

1 11 21 31 41 51

| | | | | |

MQCQLFRTET SKAVSELNYD YICIKAGTGR PQGTPTIGLV LLVRWAIIYE TELQSQPIT 

SEQ ID NO:93 PEE6 DNA SEQUENCE

Nucleic Acid Accession #: NM_002606

Coding sequence: 61-1842 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51 

| | | | | |

CGCGGCGGCT GGCGTCGGGA AAGTACAGTA AAAAGTCCGA GTGCAGCCGC CGGGCGCAGG 60 

ATGGGATCCG GCTCCTCCAG CTACCGGCCC AAGGCCATCT ACCTGGACAT CGATGGACGC 120 

ATTCAGAAGG TAATCTTCAG CAAGTACTGC AACTCCAGCG ACATCATGGA CCTGTTCTGC 180 

ATCGCCACCG GCCTGCCTCG GAACACGACC ATCTCCCTGC TGACCACCGA CGACGCCATG 240 

GTCTCCATCG ACCCCACCAT GCCCGCGAAT TCAGAACGCA CTCCGTACAA AGTGAGACCT 300 

GTGGCCATCA AGCAACTCTC CGCTGGTGTC GAGGACAAGA GAACCACAAG CCGTGGCCAG 360 

TCTGCTGAGA GACCACTGAG GGACAGACGG GTTGTGGGCC TGGAGCAGCC CCGGAGGGAA 420 

GGAGCATTTG AAAGTGGACA GGTAGAGCCC AGGCCCAGAG AGCCCCAGGG CTGCTACCAG 480 

GAAGGCCAGC GCATCCCTCC AGAGAGAGAA GAATTAATCC AGAGCGTGCT GGCGCAGGTT 540 

GCAGAGCAGT TCTCAAGAGC ATTCAAAATC AATGAACTGA AAGCTGAAGT TGCAAATCAC 600 

TTGGCTGTCC TAGAGAAACG CGTGGAATTG GAAGGACTAA AAGTGGTGGA GATTGAGAAA 660 
TGCAAGAGTG ACATTAAGAA GATGAGGGAG GAGCTGGCGG CCAGAAGCAG CAGGACCAAC 720 

TGCCCCTGTA AGTACAGTTT TTTGGATAAC CACAAGAAGT TGACTCCTCG ACGCGATGTT 780 

CCCACTTACC CCAAGTACCT GCTCTCTCCA GAGACCATCG AGGCCCTGCG GAAGCCGACC 840 

TTTGACGTCT GGCTTTGGGA GCCCAATGAG ATGCTGAGCT GCCTGGAGCA CATGTACCAC 900 

GACCTCGGGC TGGTCAGGGA CTTCAGCATC AACCCTGTCA CCCTCAGGAG GTGGCTGTTC 960 

TGTGTCCACG ACAACTACAG AAACAACCCC TTCCACAACT TCCGGCACTG CTTCTGCGTG 1020 

GCCCAGATGA TGTACAGCAT GGTCTGGCTC TGCAGTCTCC AGGAGAAGTT CTCACAAACG 1080 

GATATCCTGA TCCTAATGAC AGCGGCCATC TGCCACGATC TGGACCATCC CGGCTACAAC 1140 

AACACGTACC AGATCAATGC CCGCACAGAG CTGGCGGTCC GCTACAATGA CATCTCACCG 1200 

CTGGAGAACC ACCACTGCGC CGTGGCCTTC CAGATCCTCG CCGAGCCTGA GTGCAACATC 1260 

TTCTCCAACA TCCCACCTGA TGGGTTCAAG CAGATCCGAC AGGGAATGAT CACATTAATC 1320 

TTGGCCACTG ACATGGCAAG ACATGCAGAA ATTATGGATT CTTTCAAAGA GAAAATGGAG 1380 

AATTTTGACT ACAGCAACGA GGAGCACATG ACCCTGCTGA AGATGATTTT GATAAAATGC 1440 

TGTGATATCT CTAACGAGGT CCGTCCAATG GAAGTCGCAG AGCCTTGGGT GGACTGTTTA 1500 

TTAGAGGAAT ATTTTATGCA GAGCGACCGT GAGAAGTCAG AAGGCCTTCC TGTGGCACCG 1560 

TTCATGGACC GAGACAAAGT GACCAAGGCC ACAGCCCAGA TTGGGTTCAT CAAGTTTGTC 1620 

CTGATCCCAA TGTTTGAAAC AGTGACCAAG CTCTTCCCCA TGGTTGAGGA GATCATGCTG 1680 

CAGCCACTTT GGGAATCCCG AGATCGCTAC GAGGAGCTGA AGCGGATAGA TGACGCCATG 1740 

AAAGAGTTAC AGAAGAAGAC TGACAGCTTG ACGTCTGGGG CCACCGAGAA GTCCAGAGAG 1800 

AGAAGCAGAG ATGTGAAAAA CAGTGAAGGA GACTGTGCCT GAGGAAAGCG GGGGGCGTGG 1860

CTGCAGTTCT GGACGGGCTG GCCGAGCTGC GCGGGATCCT TGTGCAGGGA AGAGCTGCCC 1920 

TGGGCACCTG GCACCACAAG ACCATGTTTT CTAAGAACCA TTTTGTTCAC TGATACAAAA 1980 
AAAAAAAAAA A 



SEQ ID NO:94 PEE6 Protein sequence
Protein Accession #: NP_002597



1 11 21 31 41 51

| | | | | |



MGSGSSSYRP KAIYLDIDGR IQKVIFSKYC NSSDIMDLFC IATGLPRNTT ISLLTTDDAM 60 

VSIDPTMPAN SERTPYKVRF VAIKQLSAGV EDKRTTSRGQ SAERPLRDRR WGLEQPRRE 120 

GAFESGQVEP RPREPQGCYQ EGQRIPPERE ELIQSVLAQV AEQFSRAFKI NELKAEVANH 180 

LAVLEKRVEL EGLKWEIEK CKSDIKKMRE ELAARSSRTN CPCKYSFLDN HKKLTPRRDV 240

PTYPKYLLSP ETIEALRKPT FDVWLWEPNE MLSCLEHMYH DLGLVRDFSI NPVTLRRWLF 300

CVHDNYRNNP FHNFRHCFCV AQMMYSMVWL CSLQEKFSQT DILILMTAAI CHDLDHPGYN 360

NTYQINARTE LAVRYNDISP LENHHCAVAF QILAEPECNI FSNIPPDGFK QIRQGMITLI 420 

LATDMARHAE IMDSFKEKME NFDYSNEEHM TLLKMILIKC CDISNEVRPM EVAEPWVDCL 480 

LEEYFMQSDR EKSEGLPVAP FMDRDKVTKA TAQIGFIKFV LIPMFETVTK LFPMVEEIML 540 
QFLWESRDRY EELKRIDDAM KELQKKTDSL TSGATEKSRE RSRDVKNSEG DCA 



SEQ ID NO:95 PEG4 DNA SEQUENCE 

Nucleic Acid Accession #: none 

Coding sequence: 41-559 (underlined sequences correspond to start and stop codons) 

1 11 21 31 41 51 

| | | | | | 

CAGTCACAGG CGAGAGCCYT GGGATGCACC GGCCAGAGGC ATGCTGCTGC TGCTCACGCT 60 

TGCCCTCCTG GGGGGCCCCA CCTGGGCAGG GAAGATGTAT GGCCCTGGAG GAGGCAAGTA 120 

TTTCAGCACC ACTGAAGACT ACGACCATGA AATCACAGGG CTGCGGGTGT CTGTAGGTCT 180 

TCTCCTGGTG AAAAGTGTCC AGGTGAAACT TGGAGACTCC TGGGACGTGA AACTGGGAGC 240 

CTTAGGTGGG AATACCCAGG AAGTCACCCT GCAGCCAGGC GAATACATCA CAAAAGTCTT 300 

TGTCGCCTTC CAAGCTTTCC TCCGGGGTAT GGTCATGTAC ACCAGCAAGG ACCGCTATTT 360 

CTATTTTGGG AAGCTTGATG GCCAGATCTC CTCTGCCTAC CCCAGCCAAG AGGGGCAGGT 420 

GCTGGTGGGC ATCTATGGCC AGTATCAACT CCTTGGCATC AAGAGCATTG GCTTTGAATG 480 

GAATTATCCA CTAGAGGAGC CGACCACTGA GCCACCAGTT AATCTCACAT ACTCAGCAAA 540 

CTCACCCGTG GGTCGCTAGG GTGGGGTATG GGGCCATCCG AGCTGAGGCC ATCTGTGTGG 600 

TGGTGGCTGA TGGTACTGGA GTAACTGAGT CGGGACGCTG AATCTGAATC CACCAATAAA 660 
TAAAGCTTCT GCAGAATCAG TGAAAAAAAA A 



SEQ ID NO:96 PEG4 Protein sequence 

Protein Accession #: FGENESH predicted 



1 11 21 31 41 51 

| | | | | | 

MLLLLTLALL GGPTWAGKMY GPGGGKYFST TEDYDHEITG LRVSVGLLLV KSVQVKLGDS 60 
WDVKLGALGG NTQEVTLQPG EYITKVFVAF QAFLRGMVMY TSKDRYFYFG KLDGQISSAY 120 
PSQEGQVLVG IYGQYQLLGI KSIGFEWNYP LEEPTTEPPV NLTYSANSPV GR 
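
The coordinate convention used in the "Coding sequence" lines can be checked mechanically. The following is a minimal Python sketch, not part of the original listing, assuming that coordinates are 1-based and inclusive and that each row consists of ten-letter groups followed by a running position count. The helper names `parse_listing_rows` and `extract_cds` are hypothetical. It uses the first row of SEQ ID NO:95, whose coding sequence is stated to begin at position 41.

```python
import re


def parse_listing_rows(rows):
    """Join listing rows into one uppercase sequence string.

    Assumption: spaces separate ten-letter groups and any digits are
    running position counts, so only letters are kept.
    """
    return "".join(re.sub(r"[^A-Za-z]", "", row) for row in rows).upper()


def extract_cds(seq, start, end):
    """Return the subsequence for 1-based, inclusive coordinates."""
    if not (1 <= start <= end <= len(seq)):
        raise ValueError("coordinates fall outside the sequence")
    return seq[start - 1:end]


# First row of SEQ ID NO:95 (PEG4), copied from the listing above.
rows = ["CAGTCACAGG CGAGAGCCYT GGGATGCACC GGCCAGAGGC ATGCTGCTGC TGCTCACGCT 60"]
seq = parse_listing_rows(rows)

# Positions 41-60: the slice should open with the ATG start codon.
head = extract_cds(seq, 41, 60)
print(head)
```

Passing all rows of a sequence and the full stated range (for example 41-559) would return the complete coding sequence, stop codon included, under the same assumptions.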



SEQ ID NO:97 PEL9 DNA SEQUENCE 

Nucleic Acid Accession #: NM_006953 

Coding sequence: 33-896 (underlined sequences correspond to start and stop codons) 

1 11 21 31 41 51 

| | | | | | 

CCGTTCCGCG CTCTGGCGGC TCCTCCCGGG CGATGCCTCC GCTCTGGGCC CTGCTGGCCC 60 

TCGGCTGCCT GCGGTTCGGC TCGGCTGTGA ACCTGCAGCC CCAACTGGCC AGTGTGACTT 120 

TCGCCACCAA CAACCCCACA CTTACCACTG TGGCCTTGGA AAAGCCTCTC TGCATGTTTG 180 

ACAGCAAAGA GGCCCTCACT GGCACCCACG AGGTCTACCT GTATGTCCTG GTCGACTCAG 240 

CCATTTCCAG GAATGCCTCA GTGCAAGACA GCACCAACAC CCCACTGGGC TCAACGTTCC 300 



TACAAACAGA GGGTGGGAGG ACAGGTCCCT ACAAAGCTGT GGCCTTTGAC CTGATCCCCT 360 

GCAGTGACCT GCCCAGCCTG GATGCCATTG GGGATGTGTC CAAGGCCTCA CAGATCCTGA 420 

ATGCCTACCT GGTCAGGGTG GGTGCCAACG GGACCTGCCT GTGGGATCCC AACTTCCAGG 480 

GCCTCTGTAA CGCACCCCTG TCGGCAGCCA CGGAGTACAG GTTCAAGTAT GTCCTGGTCA 540 

ATATGTCCAC GGGCTTGGTA GAGGACCAGA CCCTGTGGTC GGACCCCATC CGCACCAACC 600 

AGCTCACCCC ATACTCGACG ATCGACACGT GGCCAGGCCG GCGGAGCGGA GGCATGATCG 660 

TCATCACTTC CATCCTGGGC TCCCTGCCCT TCTTTCTACT TGTGGGTTTT GCTGGCGCCA 720 

TTGCCCTCAG CCTCGTGGAC ATGGGGAGTT CTGATGGGGA AACGACTCAC GACTCCCAAA 780 

TCACTCAGGA GGCTGTTCCC AAGTCGCTGG GGGCCTCGGA GTCTTCCTAC ACGTCCGTGA 840 

ACCGGGGGCC GCCACTGGAC AGGGCTGAGG TGTATTCCAG CAAGCTCCAA GACTGAGCCC 900 

AGCACCACCC CTGGGCAGCA GCATCCTCCT CTCTGGCCTT GCCCCAGGCC CTGCAGCGGT 960 

GGTTGTCACA CCCTGACTTC AGGGAAGGTG AAACAGGGCT TGTCCCTCCA ACTGCAGGAA 1020 
AACCCTTAAT AAAATCTTCT GATGAGTTCT AAAAAAAAA 

SEQ ID NO:98 PEL9 Protein sequence 
Protein Accession #: NP_008884 

1 11 21 31 41 51 

| | | | | | 

MPPLWALLAL GCLRFGSAVN LQPQLASVTF ATNNPTLTTV ALEKPLCMFD SKEALTGTHE 60 

VYLYVLVDSA ISRNASVQDS TNTPLGSTFL QTEGGRTGPY KAVAFDLIPC SDLPSLDAIG 120 

DVSKASQILN AYLVRVGANG TCLWDPNFQG LCNAPLSAAT EYRFKYVLVN MSTGLVEDQT 180 

LWSDPIRTKQ LTPYSTIDTW PGRRSGGMIV ITSILGSLPF FLLVGFAGAI ALSLVDMGSS 240 
DGETTHDSQI TQEAVPKSLG ASESSYTSVN RGPPLDRAEV YSSKLQD 






SEQ ID NO:99 PEN1 DNA SEQUENCE 

Nucleic Acid Accession #: NM_012391 

Coding sequence: 416-1423 (underlined sequences correspond to start and stop codons) 



1 11 21 31 41 51 

| | | | | | 

GTCTGACTTC CTCCCAGCAC ATTCCTGCAC TCTGCCGTGT CCACACTGCC CCACAGACCC 60 

AGTCCTCCAA GCCTGCTGCC AGCTCCCTGC AAGCCCCTCA GGTTGGGCCT TGCCACGGTG 120 

CCAGCAGGCA GCCCTGGGCT GGGGGTAGGG GACTCCCTAC AGGCACGCAG CCCTGAGACC 180 

TCAGAGGGCC ACCCCTTGAG GGTGGCCAGG CCCCCAGTGG CCAACCTGAG TGCTGCCTCT 240 

GCCACCAGCC CTGCTGGCCC CTGGTTCCGC TGGCCCCCCA GATGCCTGGC TGAGACACGC 300 

CAGTGGCCTC AGCTGCCCAC ACCTCTTCCC GGCCCCTGAA GTTGGCACTG CAGCAGACAG 360 

CTCCCTGGGC ACCAGGCAGC TAACAGACAC AQCCQCCAGC CCAAACAGCA GCGGCATGGG 420 

CAGCGCCAGC CCGGGTCTGA GCAGCGTATC CCCCAGCCAC CTCCTGCTGC CCCCCGACAC 480 

GGTGTCGCGG ACAGGCTTGG AGAAGGCGGC AGCGGGGGCA GTGGGTCTCG AGAGACGGGA 540 

CTGGAGTCCC AGTCCACCCG CCACGCCCGA GCAGGGCCTG TCCGCCTTCT ACCTCTCCTA 600 

CTTTGACATG CTGTACCCTG AGGACAGCAG CTGGGCAGCC AAGGCCCCTG GGGCCAGCAG 660 

TCGGGAGGAG CCACCTGAGG AGCCTGAGCA GTGCCCGGTC ATTGACAGCC AAGCCCCAGC 720 

GGGCAGCCTG GACTTGGTGC CCGGCGGGCT GACCTTGGAG GAGCACTCGC TGGAGCAGGT 780 

GCAGTCCATG GTGGTGGGCG AAGTGCTCAA GGACATCGAG ACGGCCTGCA AGCTGCTCAA 840 

CATCACCGCA GATCCCATGG ACTGGAGCCC CAGCAATGTG CAGAAGTGGC TCCTGTGGAC 900 

AGAGCACCAA TACCGGCTGC CCCCCATGGG CAAGGCCTTC CAGGAGCTGG CGGGCAAGGA 960 

GCTGTGCGCC ATGTCGGAGG AGCAGTTCCG CCAGCGCTCG CCCCTGGGTG GGGATGTGCT 1020 

GCACGCCCAC CTGGACATCT GGAAGTCAGC GGCCTGGATG AAAGAGCGGA CTTCACCTGG 1080 

GGCGATTCAC TACTGTGCCT CGACCAGTGA GGAGAGCTGG ACCGACAGCG AGGTGGACTC 1140 

ATCATGCTCC GGGCAGCCCA TCCACCTGTG GCAGTTCCTC AAGGAGTTGC TACTCAAGCC 1200 

CCACAGCTAT GGCCGCTTCA TTAGGTGGCT CAACAAGGAG AAGGGCATCT TCAAAATTGA 1260 

GGACTCAGCC CAGGTGGCCC GGCTGTGGGG CATCCGCAAG AACCGTCCCG CCATGAACTA 1320 

CGACAAGCTG AGCCGCTCCA TCCGCCAGTA TTACAAGAAG GGCATCATCC GGAAGCCAGA 1380 

CATCTCCCAG CGCCTCGTCT ACCAGTTCGT GGACCCCATC TGAGTGCCTG GCCCAGGGCC 1440 

TGAAACCCGC CCTCAGGGGC CTCTCTCCTG CCTGCCCTGC CTCAGCCAGG CCCTGAGATG 1500 

GGGGAAAACG GGCAGTCTGC TCTGCTGCTC TGACCTTCCA GAGCCCAAGG TCAGGGAGGG 1560 

GCAACCAACT GCCCCAGGGG GATATGGGTC CTCTGGGGCC TTCGGGACCA TGGGGCAGGG 1620 

GTGCTTCCTC CTCAGGCCCA GCTGCTCCCC TGGAGGACAG AGGGAGACAG GGCTGCTCCC 1680 

CAACACCTGC CTCTGACCCC AGCATTTCCA GAGCAGAGCC TACAGAAGGG CAGTGACTCG 1740 

ACAAAGGCCA CAGGCAGTCC AGGCCTCTCT CTGCTCCATC CCCCTGCCTC CCATTCTGCA 1800 

CCACACCTGG CATGGTGCAG GGAGACATCT GCACCCCTGA GTTGGGCAGC CAGGAGTGCC 1860 
CCCGGGAATG GATAATAAAG ATACTAGAGA ACTG 

SEQ ID NO:100 PEN1 Protein sequence 
Protein Accession #: NP_036523 



1 11 21 31 41 51 

| | | | | | 

MGSASPGLSS VSPSHLLLPP DTVSRTGLEK AAAGAVGLER RDWSPSPPAT PEQGLSAFYL 60 
SYFDMLYPED SSWAAKAPGA SSREEPPEEP EQCPVIDSQA PAGSLDLVPG GLTLEEHSLE 120 
QVQSMVVGEV LKDIETACKL LNITADPMDW SPSNVQKWLL WTEHQYRLPP MGKAFQELAG 180 
KELCAMSEEQ FRQRSPLGGD VLHAHLDIWK SAAWMKERTS PGAIHYCAST SEESWTDSEV 240 
DSSCSGQPIH LWQFLKELLL KPHSYGRFIR WLNKEKGIFK IEDSAQVARL WGIRKNRPAM 300 
NYDKLSRSIR QYYKKGIIRK PDISQRLVYQ FVHPI 









SEQ ID NO:101 PEN3 DNA SEQUENCE 

Nucleic Acid Accession #: NM_000742 

Coding sequence: 555-2144 (underlined sequences correspond to start and stop codons) 



1 11 21 31 41 51 

| | | | | | 

GAGAGAACAG CGTGAGCCTG TGTGCTTGTG TGCTGAGCCC TCATCCCCTC CTGGGGCCAG 60 
GCTTGGGTTT CACCTGCAGA ATCGCTTGTG CTGGGCTGCC TGGGCTGTCC TCAGTGGCAC 120 
CTGCATGAAG CCGTTCTGGC TGCCAGAGCT GGACAGCCCC AGGAAAACCC ACCTCTCTGC 180 
AGAGCTTGCC CAGCTGTCCC CGGGAAGCCA AATGCCTCTC ATGTAAGTCT TCTGCTCGAC 240 
GGGGTGTCTC CTAAACCCTC ACTCTTCAGC CTCTGTTTGA CCATGAAATG AAGTGACTGA 300 
GCTCTATTCT GTACCTGCCA CTCTATTTCT GGGGTGACTT TTGTCAGCTG CCCAGAATCT 360 
CCAAGCCAGG CTGGTTCTCT GCATCCTTTC AATGACCTGT TTTCTTCTGT AACCACAGGT 420 
TCGGTGGTGA GAGGAAGCCT CGCAGAATCC AGCAGAATCC TCACAGAATC CAGCAGCAGC 480 
TCTGCTGGGG ACATGGTCCA TGGTGCAACC CACAGCAAAG CCCTGACCTG ACCTCCTGAT 540 
GCTCAGGAGA AGCCATGGGC CCCTCCTGTC CTGTGTTCCT GTCCTTCACA AAGCTCAGCC 600 
TGTGGTGGCT CCTTCTGACC CCAGCAGGTG GAGAGGAAGC TAAGCGCCCA CCTCCCAGGG 660 
CTCCTGGAGA CCCACTCTCC TCTCCCAGTC CCACGGCATT GCCGCAGGGA GGCTCGCATA 720 
CCGAGACTGA GGACCGGCTC TTCAAACACC TCTTCCGGGG CTACAACCGC TGGGCGCGCC 780 
CGGTGCCCAA CACTTCAGAC GTGGTGATTG TGCGCTTTGG ACTGTCCATC GCTCAGCTCA 840 
TCGATGTGGA TGAGAAGAAC CAAATGATGA CCACCAACGT CTGGCTAAAA CAGGAGTGGA 900 
GCGACTACAA ACTGCGCTGG AACCCCGCTG ATTTTGGCAA CATCACATCT CTCAGGGTCC 960 
CTTCTGAGAT GATCTGGATC CCCGACATTG TTCTCTACAA CAATGCAGAT GGGGAGTTTG 1020 
CAGTGACCCA CATGACCAAG GCCCACCTCT TCTCCACGGG CACTGTGCAC TGGGTGCCCC 1080 
CGGCCATCTA CAAGAGCTCC TGCAGCATCG ACGTCACCTT CTTCCCCTTC GACCAGCAGA 1140 
ACTGCAAGAT GAAGTTTGGC TCCTGGACTT ATGACAAGGC CAAGATCGAC CTGGAGCAGA 1200 
TGGAGCAGAC TGTGGACCTG AAGGACTACT GGGAGAGCGG CGAGTGGGCC ATCGTCAATG 1260 
CCACGGGCAC CTACAACAGC AAGAAGTACG ACTGCTGCGC CGAGATCTAC CCCGACGTCA 1320 
CCTACGCCTT CGTCATCCGG CGGCTGCCGC TCTTCTACAC CATCAACCTC ATCATCCCCT 1380 
GCCTGCTCAT CTCCTGCCTC ACTGTGCTGG TCTTCTACCT GCCCTCCGAC TGCGGCGAGA 1440 
AGATCACGCT GTGCATTTCG GTGCTGCTGT CACTCACCGT CTTCCTGCTG CTCATCACTG 1500 
AGATCATCCC GTCCACCTCG CTGGTCATCC CGCTCATCGG CGAGTACCTG CTGTTCACCA 1560 
TGATCTTCGT CACCCTGTCC ATCGTCATCA CCGTCTTCGT GCTCAATGTG CACCACCGCT 1620 
CCCCCAGCAC CCACACCATG CCCCACTGGG TGCGGGGGGC CCTTCTGGGC TGTGTGCCCC 1680 
GGTGGCTTCT GATGAACCGG CCCCCACCAC CCGTGGAGCT CTGCCACCCC CTACGCCTGA 1740 
AGCTCAGCCC CTCTTATCAC TGGCTGGAGA GCAACGTGGA TGCCGAGGAG AGGGAGGTGG 1800 
TGGTGGAGGA GGAGGACAGA TGGGCATGTG CAGGTCATGT GGCCCCCTCT GTGGGCACCC 1860 
TCTGCAGCCA CGGCCACCTG CACTCTGGGG CCTCAGGTCC CAAGGCTGAG GCTCTGCTGC 1920 
AGGAGGGTGA GCTGCTGCTA TCACCCCACA TGCAGAAGGC ACTGGAAGGT GTGCACTACA 1980 
TTGCCGACCA CCTGCGGTCT GAGGATGCTG ACTCTTCGGT GAAGGAGGAC TGGAAGTATG 2040 
TTGCCATGGT CATCGACAGG ATCTTCCTCT GGCTGTTTAT CATCGTCTGC TTCCTGGGGA 2100 
CCATCGGCCT CTTTCTGCCT CCGTTCCTAG CTGGAATGAT CTGACTGCAC CTCCCTCGAG 2160 
CTGGCTCCCA GGGCAAAGGG GAGGGTTCTT GGATGTGGAA GGGCTTTGAA CAATGTTTAG 2220 
ATTTGGAGAT GAGCCCAAAG TGCCAGGGAG AACAGCCAGG TGAGGTGGGA GGTTGGAGAG 2280 
CCAGGTGAGG TCTCTCTAAG TCAGGCTGGG GTTGAAGTTT GGAGTCTGTC CGAGTTTGCA 2340 
GGGTGCTGAG CTGTATGGTC CAGCAGGGGA GTAATAAGGG CTCTTCCGGA AGGGGAGGAA 2400 
GCGGGAGGCA GGCCTGCACC TGATGTGGAG GTACAGGCAG ATCTTCCCTA CCGGGGAGGG 2460 
ATGGATGGTT GGATACAGGT GGCTGGGCTA TTCCATCCAT CTGGAAGCAC ATTTGAGCCT 2520 
CCAGGCTTCT CCTTGACGTC ATTCCTCTCC TTCCTTGCTG CAAAATGGCT CTGCACCAGC 2580 
CGGCCCCCAG GAGGTCTGGC AGAGCTGAGA GCCATGGCCT GCAGGGGCTC CATATGTCCC 2640 
TACGCGTGCA GCAGGCAAAC AAGA 



SEQ ID NO:102 PEN3 Protein sequence 
Protein Accession #: NP_000733 



1 11 21 31 41 51 

| | | | | | 

MGPSCPVFLS FTKLSLWWLL LTPAGGEEAK RPPPRAPGDP LSSPSPTALP QGGSHTETED 60 
RLFKHLFRGY NRWARPVPNT SDVVIVRFGL SIAQLIDVDE KNQMMTTNVW LKQEWSDYKL 120 
RWNPADFGNI TSLRVPSEMI WIPDIVLYNN ADGEFAVTHM TKAHLFSTGT VHWVPPAIYK 180 
SSCSIDVTFF PFDQQNCKMK FGSWTYDKAK IDLEQMEQTV DLKDYWESGE WAIVNATGTY 240 
NSKKYDCCAE IYPDVTYAFV IRRLPLFYTI NLIIPCLLIS CLTVLVFYLP SDCGEKITLC 300 
ISVLLSLTVF LLLITEIIPS TSLVIPLIGE YLLFTMIFVT LSIVITVFVL NVHHRSPSTH 360 
TMPHWVRGAL LGCVPRWLLM NRPPPPVELC HPLRLKLSPS YHWLESNVDA EEREVVVEEE 420 
DRWACAGHVA PSVGTLCSHG HLHSGASGPK AEALLQEGEL LLSPHMQKAL EGVHYIADHL 480 
RSEDADSSVK EDWKYVAMVI DRIFLWLFII VCFLGTIGLF LPPFLAGMI 



SEQ ID NO:103 PEU4 DNA SEQUENCE 

Nucleic Acid Accession #: NM_018670 

Coding sequence: 87-893 (underlined sequences correspond to start and stop codons) 



1 11 21 31 41 51 

| | | | | | 

CACGAGGCTG GAAGGGGCCA CTTCACACCT CGGGCTCGGC ATAAAGCGGC CGCCGGCCGC 60 

CGGCCCCCAG ACGCGCCGCC GCTGCCATGG CCCAGCCCCT GTGCCCGCCG CTCTCCGAGT 120 

CCTGGATGCT CTCTGCGGCC TGGGGCCCAA CTCGGCGGCC GCCGCCCTCC GACAAGGACT 180 

GCGGCCGCTC CCTCGTCTCG TCCCCAGACT CATGGGGCAG CACCCCAGCC GACAGCCCCG 240 

TGGCGAGCCC CGCGCGGCCA GGCACCCTCC GGGACCCCCG CGCCCCCTCC GTAGGTAGGC 300 

GCGGCGCGCG CAGCAGCCGC CTGGGCAGCG GGCAGAGGCA GAGCGCCAGT GAGCGGGAGA 360 

AACTGCGCAT GCGCACGCTG GCCCGCGCCC TGCACGAGCT GCGCCGCTTT CTACCGCCGT 420 

CCGTGGCGCC CGCGGGCCAG AGCCTGACCA AGATCGAGAC GCTGCGCCTG GCTATCCGCT 480 

ATATCGGCCA CCTGTCGGCC GTGCTAGGCC TCAGCGAGGA GAGTCTCCAG CGCCGGTGCC 540 

GGCAGCGCGG TGACGCGGGG TCCCCTCGGG GCTGCCCGCT GTGCCCCGAC GACTGCCCCG 600 

CGCAGATGCA GACACGGACG CAGGCTGAGG GGCAGGGGCA GGGGCGCGGG CTGGGCCTGG 660 



TATCCGCCGT CCGCGCCGGG GCGTCCTGGG GATCCCCGCC TGCCTGCCCC GGAGCCCGAG 720 

CTGCACCCGA GCCGCGCGAC CCGCCTGCGC TGTTCGCCGA GGCGGCGTGC CCGGAAGGGC 780 

AGGCGATGGA GCCAAGCCCA CCGTCCCCGC TCCTTCCGGG CGACGTGCTG GCTCTGTTGG 840 

AGACCTGGAT GCCCCTCTCG CCTCTGGAGT GGCTGCCTGA GGAGCCCAAG TGACAAGGGA 900 

CAACTGACGC CGTCTCTGTG AGCACCGAGG CTTTTTGGCC TCAGCACCTT CGAAGTGGTT 960 

CCTTGGCAGA CTGCCTTTCC TGGAAGAGGG CACGGGCGAT CCCGACGGGG GCATTCCTGC 1020 

GGGTGAGAGC CGTCCCCACC GCGGCGGCCC TTCTCAGCCC CTCCCTCCAT GGAGGGACCC 1080 

ATAGGGCTAG ACACTTTGAG GCAAGCAGGA GGCTCTGCCT AATGTGAATT TATTTATTTG 1140 
TGAATAAACT GTACTGGTGT CAAAAAAAAA AAAAAAAAAA A 

SEQ ID NO:104 PEU4 Protein sequence 
Protein Accession #: NP_061140 

1 11 21 31 41 51 

| | | | | | 

MAQPLCPPLS ESWMLSAAWG PTRRPPPSDK DCGRSLVSSP DSWGSTPADS PVASPARPGT 60 

LRDPRAPSVG RRGARSSRLG SGQRQSASER EKLRMRTLAR ALHELRRFLP PSVAPAGQSL 120 

TKIETLRLAI RYIGHLSAVL GLSEESLQKR CRQRGDAGSP RGCPLCPDDC PAQMQTRTQA 180 

EGQGQGRGLG LVSAVRAGAS WGSPPACPGA RAAPEPRDPP ALFAEAACPE GQAMEPSPPS 240 
PLLPGDVLAL LETWMPLSPL EWLPEEPK 

SEQ ID NO:105 PEU5 DNA SEQUENCE 

Nucleic Acid Accession #: NM_017636 

Coding sequence: 324-3374 (underlined sequences correspond to start and stop codons) 

1 11 21 31 41 51 

| | | | | | 

CCACGGAGAA GCCCACCGAT GCCTACGGAG AGCTGGACTT CACGGGGGCC GGCCGCAAGC 60 

ACAGCAATTT CCTCCGGCTC TCTGACCGAA CGGATCCAGC TGCAGTTTAT AGTCTGGTCA 120 

CACGCACATG GGGCTTCCGT GCCCCGAACC TGGTGGTGTC AGTGCTGGGG GGATCGGGGG 180 

GCCCCGTCCT CCAGACCTGG CTGCAGGACC TGCTGCGTCG TGGGCTGGTG CGGGCTGCCC 240 

AGAGCACAGG AGCCTGGATT GTCACTGGGG GTCTGCACAC GGGCATCGGC CGGCATGTTG 300 

GTGTGGCTGT ACGGGACCAT CAGATGGCCA GCACTGGGGG CACCAAGGTG GTGGCCATGG 360 

GTGTGGCCCC CTGGGGTGTG GTCCGGAATA GAGACACCCT CATCAACCCC AAGGGCTCGT 420 

TCCCTGCGAG GTACCGGTGG CGCGGTGACC CGGAGGACGG GGTCCAGTTT CCCCTGGACT 480 

ACAACTACTC GGCCTTCTTC CTGGTGGACG ACGGCACACA CGGCTGCCTG GGGGGCGAGA 540 

ACCGCTTCCG CTTGCGCCTG GAGTCCTACA TCTCACAGCA GAAGACGGGC GTGGGAGGGA 600 

CTGGAATTGA CATCCCTGTC CTGCTCCTCC TGATTGATGG TGATGAGAAG ATGTTGACGC 660 

GAATAGAGAA CGCCACCCAG GCTCAGCTCC CATGTCTCCT CGTGGCTGGC TCAGGGGGAG 720 

CTGCGGACTG CCTGGCGGAG ACCCTGGAAG ACACTCTGGC CCCAGGGAGT GGGGGAGCCA 780 

GGCAAGGCGA AGCCCGAGAT CGAATCAGGC GTTTCTTTCC CAAAGGGGAC CTTGAGGTCC 840 

TGCAGGCCCA GGTGGAGAGG ATTATGACCC GGAAGGAGCT CCTGACAGTC TATTCTTCTG 900 

AGGATGGGTC TGAGGAATTC GAGACCATAG TTTTGAAGGC CCTTGTGAAG GCCTGTGGGA 960 

GCTCGGAGGC CTCAGCCTAC CTGGATGAGC TGCGTTTGGC TGTGGCTTGG AACCGCGTGG 1020 

ACATTGCCCA GAGTGAACTC TTTCGGGGGG ACATCCAATG GCGGTCCTTC CATCTCGAAG 1080 

CTTCCCTCAT GGACGCCCTG CTGAATGACC GGCCTGAGTT CGTGCGCTTG CTCATTTCCC 1140 

ACGGCCTCAG CCTGGGCCAC TTCCTGACCC CGATGCGCCT GGCCCAACTC TACAGCGCGG 1200 

CGCCCTCCAA CTCGCTCATC CGCAACCTTT TGGACCAGGC GTCCCACAGC GCAGGCACCA 1260 

AAGCCCCAGC CCTAAAAGGG GGAGOVGCGG AGCTCCGGCC CCCTGACGTG GGGCATGTGC 1320 

TGAGGATGCT GCTGGGGAAG ATGTGCGCGC CGAGGTACCC CTCCGGGGGC GCCTGGGACC 1380 

CTCACCCAGG CCAGGGCTTC GGGGAGAGCA TGTATCTGCT CTCGGACAAG GCCACCTCGC 1440 

CGCTCTCGCT GGATGCTGGC CTCGGGCAGG CCCCCTGGAG CGACCTGCTT CTTTGGGCAC 1500 

TGTTGCTGAA CAGGGCACAG ATGGCCATGT ACTTCTGGGA GATGGGTTCC AATGCAGTTT 1560 

CCTCAGCTCT TGGGGCCTGT TTGCTGCTCC GGGTGATGGC ACGCCTGGAG CCTGACGCTG 1620 

AGGAGGCAGC ACGGAGGAAA GACCTGGCGT TCAAGTTTGA GGGGATGGGC GTTGACCTCT 1680 

TTGGCGAGTG CTATCGCAGC AGTGAGGTGA GGGCTGCCCG CCTCCTCCTC CGTCGCTGCC 1740 

CGCTCTGGGG GGATGCCACT TGCCTCCAGC TGGCCATGCA AGCTGACGCC CGTGCCTTCT 1800 

TTGCCCAGGA TGGGGTACAG TCTCTGCTGA CACAGAAGTG GTGGGGAGAT ATGGCCAGCA 1860 

CTACACCCAT CTGGGCCCTG GTTCTCGCCT TCTTTTGCCC TCCACTCATC TACACCCGCC 1920 

TCATCACCTT CAGGAAATCA GAAGAGGAGC CCACACGGGA GGAGCTAGAG TTTGACATGG 1980 

ATAGTGTCAT TAATGGGGAA GGGCCTGTCG GGACGGCGGA CCCAGCCGAG AAGACGCCGC 2040 

TGGGGGTCCC GCGCCAGTCG GGCCGTCCGG GTTGCTGCGG GGGCCGCTGC GGGGGGCGCC 2100 

GGTGCCTACG CCGCTGGTTC CACTTCTGGG GCGCGCCGGT GACCATCTTC ATGGGCAACG 2160 

TGGTCAGCTA CCTGCTGTTC TTGCTGCTTT TCTCGCGGGT GCTGCTCGTG GATTTCCAGC 2220 

CGGCGCCGCC CGGCTGCCTG GAGCTGCTGC TCTATTTCTG GGCTTTCACG CTGCTGTGCG 2280 

AGGAACTGCG CCAGGGCCTG AGCGGAGGCG GGGGCAGCCT CGCCAGCGGG GGCCCCGGGC 2340 

CTGGCCATGC CTCACTGAGC CAGCGCCTGC GCCTCTACCT CGCCGACAGC TGGAACCAGT 2400 

GCGACCTAGT GGCTCTCACC TGCTTCCTCC TGGGCGTGGG CTGCCGGCTG ACCCCGGGTT 2460 

TGTACCACCT GGGCCGCACT GTCCTCTGCA TCGACTTCAT GGTTTTCACG GTGCGGCTGC 2520 

TTCACATCTT CACGGTCAAC AAACAGCTGG GGCCCAAGAT CGTCATCGTG AGCAAGATGA 2580 

TGAAGGACGT GTTCTTCTTC CTCTTCTTCC TCGGCGTGTG GCTGGTAGCC TATGGCGTGG 2640 

CCACGGAGGG GCTCCTGAGG CCACGGGACA GTGACTTCCC AAGTATCCTG CGCCGCGTCT 2700 

TCTACCGTCC CTACCTGCAG ATCTTCGGGC AGATTCCCCA GGAGGACATG GACGTGGCCC 2760 

TCATGGAGCA CAGCAACTGC TCGTCGGAGC CCGGCTTCTG GGCACACCCT CCTGGGGCCC 2820 

AGGCGGGCAC CTGCGTCTCC CAGTATGCCA ACTGGCTGGT GGTGCTGCTC CTCGTCATCT 2880 

TCCTGCTCGT GGCCAACATC CTGCTGGTCA ACTTGCTCAT TGCCATGTTC AGTTACACAT 2940 

TCGGCAAAGT ACAGGGCAAC AGCGATCTCT ACTGGAAGGC GCAGCGTTAC CGCCTCATCC 3000 

GGGAATTCCA CTCTCGGCCC GCGCTGGCCC CGCCCTTTAT CGTCATCTCC CACTTGCGCC 3060 

TCCTGCTCAG GCAATTGTGC AGGCGACCCC GGAGCCCCCA GCCGTCCTCC CCGGCCCTCG 3120 

AGCATTTCCG GGTTTACCTT TCTAAGGAAG CCGAGCGGAA GCTGCTAACG TGGGAATCGG 3180 

TGCATAAGGA GAACTTTCTG CTGGCACGCG CTAGGGACAA GCGGGAGAGC GACTCCGAGC 3240 

GTCTGGAGCG CACGTCCCAG AAGGTGGACT TGGCACTGAA ACAGCTGGGA CACATCCGCG 3300 

AGTACGAACA GCGCCTGAAA GTGCTGGAGC GGGAGGTCCA GCAGTGTAGC CGCGTCCTGG 3360 

GGTGGGTGAC GTAGGCCGTT AGCAGCTCTG CCATGTTGCC CTCAGGTGGG CCGCCACCCC 3420 

TTGACCTGCA TGGGTCCAAA GAGTGAGCCA TGCTGGCGGA TTTTAAGGAG AAGCCCCCAC 3480 

AGGGGATTTT GCTCTTAGAG TAAGGCTCAT GTGGGCCTCG GCCCCCGCAC CTGGTGGCCT 3540 

TGTCCTTGAG GTGAGCCCCA TGTCCATCTG GGCCACTGTC AGGACCACCT TTGGGAGTGT 3600 

CATCCTTACA AACCACAGCA TGCCCGGCTC CTCCCAGAAC CAGTCCCAGC CTGGGAGGAT 3660 

CAAGGCCTGG ATCCCGGGCC GTTATCCATC TGGAGGCTGC AGGGTCCTTG GGGTAACAGG 3720 

GACCACAGAC CCCTCACCAC TCACAGATTC CTCACACTGG GGAAATAAAG CCATTTCAGA 3780 
GGAAAAAAAA AAAAAAAAAA AAAAAAAAAA 

SEQ ID NO:106 PEU5 Protein sequence 
Protein Accession #: NP_060106 

1 11 21 31 41 51 

MASTGGTKVV AMGVAPWGVV RNRDTLINPK GSFPARYRWR GDPEDGVQFP LDYNYSAFFL 60 

VDDGTHGCLG GENRFRLRLE SYISQQKTGV GGTGIDIPVL LLLIDGDEKM LTRIENATQA 120 

QLPCLLVAGS GGAADCLAET LEDTLAPGSG GARQGEARDR IRRFFPKGDL EVLQAQVERI 180 

MTRKELLTVY SSEDGSEEFE TIVLKALVKA CGSSEASAYL DELRLAVAWN RVDIAQSELF 240 

RGDIQWRSFH LEASLMDALL NDRPEFVRLL ISHGLSLGHF LTPMRLAQLY SAAPSNSLIR 300 

NLLDQASHSA GTKAPALKGG AAELRPPDVG HVLRMLLGKM CAPRYPSGGA WDPHPGQGFG 360 

ESMYLLSDKA TSPLSLDAGL GQAPWSDLLL WALLLNRAQM AMYFWEMGSN AVSSALGACL 420 

LLRVMARLEP DAEEAARRKD LAFKFEGMGV DLFGECYRSS EVRAARLLLR RCPLWGDATC 480 

LQLAMQADAR AFFAQDGVQS LLTQKWWGDM ASTTPIWALV LAFFCPPLIY TRLITFRKSE 540 

EEPTREELEF DMDSVINGEG PVGTADPAEK TPLGVPRQSG RPGCCGGRCG GRRCLRRWFH 600 

FWGAPVTIFM GNVVSYLLFL LLFSRVLLVD FQPAPPGSLE LLLYFWAFTL LCEELRQGLS 660 

GGGGSLASGG PGPGHASLSQ RLRLYLADSW NQCDLVALTC FLLGVGCRLT PGLYHLGRTV 720 

LCIDFMVFTV RLLHIFTVNK QLGPKIVIVS KMMKDVFFFL FFLGVWLVAY GVATEGLLRP 780 

RDSDFPSILR RVFYRPYLQI FGQIPQEDMD VALMEHSNCS SEPGFWAHPP GAQAGTCVSQ 840 

YANWLVVLLL VIFLLVANIL LVNLLIAMFS YTFGKVQGNS DLYWKAQRYR LIREFHSRPA 900 

LAPPFIVISH LRLLLRQLCR RPRSPQPSSP ALEHFRVYLS KEAERKLLTW ESVHKENFLL 960 
ARARDKRESD SERLERTSQK VDLALKQLGH IREYEQRLKV LEREVQQCSR VLGWVT 

SEQ ID NO:107 PEW3 DNA SEQUENCE 

Nucleic Acid Accession #: NM_005982 

Coding sequence: 276-1130 (underlined sequences correspond to start and stop codons) 

1 11 21 31 41 51 

GGTAGCAGCA TCCACCGGGC GGGAGGTCGG AGGCAGCAAG GCCTTAAAGG CTACTGAGTG 60 

CGCCGGCCGT TCCGTGTCCA GAACCTCCCC TACTCCTCCG CCTTCTCTTC CTTGGCCGCC 120 

CACCGCCAAG TTCCGACTCC GGTTTTCGCC TTTGCAAAGC CTAAGGAGGA GGTTAGGAAC 180 

AGCCGCGCCC CCCTCCCTGC GGCCGCCGCC CCCTGCCTCT CGGCTCTGCT CCCTGCCGCG 240 

TGCGCCTGGG CCGTGCGCCC CGGCAGGCGC CAGCCATGTC GATGCTGCCG TCGTTTGGCT 300 

TTACGCAGGA GCAAGTGGCG TGCGTGTGCG AGGTTCTGCA GCAAGGCGGA AACCTGGAGC 360 

GCCTGGGCAG GTTCCTGTGG TCACTGCCCG CCTGCGACCA CCTGCACAAG AACGAGAGCG 420 

TACTCAAGGC CAAGGCGGTG GTCGCCTTCC ACCGCGGCAA CTTCCGTGAG CTCTACAAGA 480 

TCCTGGAGAG CCACCAGTTC TCGCCTCACA ACCACCCCAA ACTGCAGCAA CTGTGGCTGA 540 

AGGCGCATTA CGTGGAGGCC GAGAAGCTGC GCGGCCGACC CCTGGGCGCC GTGGGCAAAT 600 

ATCGGGTGCG CCGAAAATTT CCACTGCCGC GCACCATCTG GGACGGCGAG GAGACCAGCT 660 

ACTGCTTCAA GGAGAAGTCG AGGGGTGTCC TGCGGGAGTG GTACGCGCAC AATCCCTACC 720 

CATCGCCGCG TGAGAAGCGG GAGCTGGCCG AGGCCACCGG CCTCACCACC ACCCAGGTCA 780 

GCAACTGGTT TAAGAACCGG AGGCAAAGAG ACCGGGCCGC GGAGGCCAAG GAAAGGGAGA 840 

ACACCGAAAA CAATAACTCC TCCTCCAACA AGCAGAACCA ACTCTCTCCT CTGGAAGGGG 900 

GCAAGCCGCT CATGTCCAGC TCAGAAGAGG AATTCTCACC TCCCCAAAGT CCAGACCAGA 960 

ACTCGGTCCT TCTGCTGCAG GGCAATATGG GCCACGCCAG GAGCTCAAAC TATTCTCTCC 1020 

CGGGCTTAAC AGCCTCGCAG CCCAGTCACG GCCTGCAGAC CCACCAGCAT CAGCTCCAAG 1080 

ACTCTCTGCT CGGCCCCCTC ACCTCCAGTC TGGTGGACTT GGGGTCCTAA GTGGGGAGGG 1140 

ACTGGGGCCT CGAAGGGATT CCTGGAGCAG CAACCACTGC AGCGACTAGG GACACTTGTA 1200 

AATAGAAATC AGGAACATTT TTGCAGCTTG TTTCTGGAGT TGTTTGCGCA TAAAGGAATG 1260 

GTGGACTTTC ACAAATATCT TTTTAAAAAT CAAAACCAAC AGCGATCTCA AGCTTAATCT 1320 
CCTCTTCTCT CCAACTCTTT CCACTTTTGC ATTTTCCTTC CCAATGCAGA GATCAGGG 

SEQ ID NO:108 PEW3 Protein sequence 
Protein Accession #: NP_005973 

1 11 21 31 41 51 

| | | | | | 

MSMLPSFGFT QEQVACVCEV LQQGGNLERL GRFLWSLPAC DHLHKNESVL KAKAVVAFHR 60 
GNFRELYKIL ESHQFSPHNH PKLQQLWLKA HYVEAEKLRG RPLGAVGKYR VRRKFPLPRT 120 
IWDGEETSYC FKEKSRGVLR EWYAHNPYPS PREKRELAEA TGLTTTQVSN WFKNRRQRDR 180 
AAEAKERENT ENNNSSSNKQ NQLSPLEGGK PLMSSSEEEF SPPQSPDQNS VLLLQGNMGH 240 
ARSSNYSLPG LTASQPSHGL QTHQHQLQDS LLGPLTSSLV DLGS 

SEQ ID NO:109 PFJ8 DNA SEQUENCE 

Nucleic Acid Accession #: NM_005069 

Coding sequence: 57-2060 (underlined sequences correspond to start and stop codons) 
Attorney Docket No.: 01850I-004200US 



1 11 21 31 41 51 
| | | | | | 

GGGGCTCCGC GGGCCTGGAG CACGGCCGGG TCTAATATGC CCGGAGCCGA GGCGCGATGA 60 
AGGAGAAGTC CAAGAATGCG GCCAAGACCA GGAGGGAGAA GGAAAATGGC GAGTTTTACG 120 
AGCTTGCCAA GCTGCTCCCG CTGCCGTCGG CCATCACTTC GCAGCTGGAC AAAGCGTCCA 180 
TCATCCGCCT CACCACGAGC TACCTGAAGA TGCGCGCCGT CTTCCCCGAA GGTTTAGGAG 240 
ACGCGTGGGG ACAGCCGAGC CGCGCCGGGC CCCTGGACGG CGTCGCCAAG GAGCTGGGAT 300 
CGCACTTGCT GCAGACTTTG GATGGATTTG TTTTTGTGGT AGCATCTGAT GGCAAAATCA 360 
TGTATATATC CGAGACCGCT TCTGTCCATT TAGGCTTATC CCAGGTGGAG CTCACGGGCA 420 
ACAGTATTTA TGAATACATC CATCCTTCTG ACCACGATGA GATGACCGCT GTCCTCACGG 480 
CCCACCAGCC GCTGCACCAC CACCTGCTCC AAGAGTATGA GATAGAGAGG TCGTTCTTTC 540 
TTCGAATGAA ATGTGTCTTG GCGAAAAGGA ACGCGGGCCT GACCTGCAGC GGATACAAGG 600 
TCATCCACTG CAGTGGCTAC TTGAAGATCA GGCAGTATAT GCTGGACATG TCCCTGTACG 660 
ACTCCTGCTA CCAGATTGTG GGGCTGGTGG CCGTGGGCCA GTCGCTGCCA CCCAGTGCCA 720 
TCACCGAGAT CAAGCTGTAC AGTAACATGT TCATGTTCAG GGCCAGCCTT GACCTGAAGC 780 
TGATATTCCT GGATTCCAGG GTGACCGAGG TGACGGGTTA CGAGCCGCAG GACCTGATCG 840 
AGAAGACCCT ATACCATCAC GTGCACGGCT GCGACGTGTT CCACCTCCGC TACGCACACC 900 
ACCTCCTGTT GGTGAAGGGC CAGGTCACCA CCAAGTACTA CCGGCTGCTG TCCAAGCGGG 960 
GCGGCTGGGT GTGGGTGCAG AGCTACGCCA CCGTGGTGCA CAACAGCCGC TCGTCCCGGC 1020 
CCCACTGCAT CGTGAGTGTC AATTATGTAC TCACGGAGAT TGAATACAAG GAACTTCAGC 1080 
TGTCCCTGGA GCAGGTGTCC ACTGCCAAGT CCCAGGACTC CTGGAGGACC GCCTTGTCTA 1140 
CCTCACAAGA AACTAGGAAA TTAGTGAAAC CCAAAAATAC CAAGATGAAG ACAAAGCTGA 1200 
GAACAAACCC TTACCCCCCA CAGCAATACA GCTCGTTCCA AATGGACAAA CTGGAATGCG 1260 
GCCAGCTCGG AAACTGGAGA GCCAGTCCCC CTGCAAGCGC TGCTGCTCCT CCAGAACTGC 1320 
AGCCCCACTC AGAAAGCAGT GACCTTCTGT ACACGCCATC CTACAGCCTG CCCTTCTCCT 1380 
ACCATTACGG ACACTTCCCT CTGGACTCTC ACGTCTTCAG CAGCAAAAAG CCAATGTTGC 1440 
CGGCCAAGTT CGGGCAGCCC CAAGGATCCC CTTGTGAGGT GGCACGCTTT TTCCTGAGCA 1500 
CACTGCCAGC CAGCGGTGAA TGCCAGTGGC ATTATGCCAA CCCCCTAGTG CCTAGCAGCT 1560 
CGTCTCCAGC TAAAAATCCT CCAGAGCCAC CGGCGAACAC TGCTAGGCAC AGCCTGGTGC 1620 
CAAGCTACGA AGCGCCCGCC GCCGCCGTGC GCAGGTTCGG CGAGGACACC GCGCCCCCGA 1680 
GCTTCCCGAG CTGCGGCCAC TACCGCGAGG AGCCCGCGCT GGGCCCGGCC AAAGCCGCCC 1740 
GCCAGGCCGC CCGGGACGGG GCGCGGCTGG CGCTGGCCCG CGCGGCACCC GAGTGCTGCG 1800 
CGCCCCCGAC CCCCGAGGCC CCGGGCGCGC CGGCGCAGCT GCCCTTCGTG CTGCTCAACT 1860 
ACCACCGCGT GCTGGCCCGG CGCGGACCGC TGGGGGGCGC CGCACCCGCC GCCTCCGGCC 1920 
TGGCCTGCGC TCCCGGCGGC CCCGAGGCGG CGACCGGCGC GCTGCGGCTC CGGCACCCGA 1980 
GCCCCGCCGC CACCTCCCCG CCCGGCGCGC CCCTGCCGCA CTACCTGGGC GCCTCGGTCA 2040 
TCATCACCAA CGGGAGGTGA CCCGCTGGCC GCCCGCGCCA GGAGCCTGGA CCCGGCCTCC 2100 
CGGGGCTGCG GCGCCACCGA GCCCGGCAAA TGCGCACGAC CTACATTAAT TTATGCAGAG 2160 
ACAGCTGTTT GAATTGGACC CCGCCGCCGA CTTGCGGATT TCCACCGCGG AGGCCCCGCG 2220 
CGCCGGTGCC GAGGGCCGAG GAGCGCCCGG GTCCGGGCAG GTGACCGCCC GCCTCTGTCC 2280 
TGCGAGGGCC GGTGCGACCC AGTTGCTGGG GGCTTGGTTT CCTCACCTTG AAATCGGGCT 2340 
TCACGCGTCT TGCCTTGTCC CCAACGTTCC ACAACAGTCC CGCTGGGGGA TTGAAGCGGT 2400 
TTCACTCCGC AAATATCCTC CACTTTCAGG AGGGAAAACC CACCCTACCA CAGTCCGCTC 2460 
TTCCAAGTGG ACGGCAGACC TGGGAGGGGA CGCCTGTGTC ACGAGCCCTT TTAGATGCTT 2520 
AGGTGAAGGC AGAAGTGATG ATTGTAAGTC CCATGAATAC ACAACTCCAC TGTCTTTAAA 2580 
AGTCATTCAA GAGTCTCATT ATTTTTGTTT TTATTTAACC CTTTCTTCAA TACAAAAAGC 2640 
CAACAAACCA AGACTAAGGG GGTGACCATG CAATTCCATT TTGTGTCTGT GAACATAGGT 2700 
GTGCTTCCCA AATACATTAA CAAGCTCTTA CTTCCCCCTA ACCCCTATGA ACTCTTGATA 2760 
ACACCAAGAG TAGCACCTTC AGAATATATT GAATAGGCAT TAAATGCAAA AATATATATG 2820 
TAGCCAGACA GTTTATGAGA ATGACCCTGT CAAGCTTCAT TATTACGTGG CAAAATCCCT 2880 
CTGGCCCACA CAGATCTGTA ATTCACTAGG CTCGTGTTTG CTACAAATAG TGCTAATAAA 2940 
GTTAAATTGC ACGTGCAATA CGGAACACTG TCAATGGACT GCACCTTGTG AAGGAAAAAC 3000 
ATGCTTAAGG GGGTGTAATG AAAATGATGT AGACATTTTA AGCATTTTCT ACACAGCGAG 3060 
AAAACTTCGT AAGAACATGT TACGTGTGCA ACAGGTAAAC AGAAATCCTT TCATAAAGCA 3120 
CCAGCAGTGT TTAAAAAATG AGCTTCCATT AATTTTTACT TTTTATGGGT TTTGCTTAAA 3180 
GATCTCAACA TGGAAAAATC CTGTCATGGC TCTGAACTGC ACAATGCATT GAACCGCCGT 3240 
CCTTCAATTT TCTTCACACT ATCAACACTG CAGCATTTTG CTGCTTTATC AAAATGGTTT 3300 
ATTTTAGGAA ACTTTTTCCA CCTTTCTGAA TGGAAAGAGG TTTTCACAAA TGTTTTAAAC 3360 
TCATCGTTCT AAAATCAAGT GCACCTACAC CAACTGCTCT CAAAATGTGA ACTGACTTTT 3420 
TTTTTTTTTT TTTTGCCAAC CCTGTGTCAC TTAGTGAGGA CCTGACACAA TCCCTACAGG 3480 
GTGTCTGTCA GTGGGCCTCA TGGTAAGAGT CACAATTTGC AAATTTAGGA CCGTGGGTCA 3540 
TGCAGCGAAG GGGCTGGATG GTAGGAAGGG ATGTGCCCGC CTCTCCACGC ACTCAGCTAT 3600 
ACCTCATTCA CAGCTCCTTG TGAGTGTGTG CACAGGAAAT AAGCCGAGGG TATTATTTTT 3660 
TTATGTTCAT GAGTCTTGTA ATTAAACCGT GATTCTTGAA AGGTGTAGGT TTGATTACTA 3720 
GGAGATACCA CCGACATTTT TCAATAAAGT ACTGCAAAAT GCTTTTGTGT CTACCTTGTT 3780 
ATTAACTTTT GGGGCTGTAT TTAGTAAAAA TAAATCAAGG CTATCGGAGC AGTTCAATAA 3840 
CAAAGGTTAC TGTTGAGAAA AAAGACCCTA TCATAGATTT ACAAG 






SEQ ID NO:110 PFJ8 Protein sequence: 
Protein Accession #: NP_005060.1 



1 11 21 31 41 51 
| | | | | | 

MKEKSKNAAK TRREKENGEF YELAKLLPLP SAITSQLDKA SIIRLTTSYL KMRAVFPEGL 60 
GDAWGQPSRA GPLDGVAKEL GSHLLQTLDG FVFVVASDGK IMYISETASV HLGLSQVELT 120 
GNSIYEYIHP SDHDEMTAVL TAHQPLHHHL LQEYEIERSF FLRMKCVLAK RNAGLTCSGY 180 
KVIHCSGYLK IRQYMLDMSL YDSCYQIVGL VAVGQSLPPS AITEIKLYSN MFMFRASLDL 240 
KLIFLDSRVT EVTGYEPQDL IEKTLYHHVH GCDVFHLRYA HHLLLVKGQV TTKYYRLLSK 300 
RGGWVWVQSY ATVVHNSRSS RPHCIVSVNY VLTEIEYKEL QLSLEQVSTA KSQDSWRTAL 360 
STSQETRKLV KPKNTKMKTK LRTNPYPPQQ YSSFQMDKLE CGQLGNWRAS PPASAAAPPE 420 
LQPHSESSDL LYTPSYSLPF SYHYGHFPLD SHVFSSKKPM LPAKFGQPQG SPCEVARFFL 480 
STLPASGECQ WHYANPLVPS SSSPAKNPPE PPANTARHSL VPSYEAPAAA VRRFGEDTAP 540 
PSFPSCGHYR EEPALGPAKA ARQAARDGAR LALARAAPEC CAPPTPEAPG APAQLPFVLL 600 
NYHRVLARRG PLGGAAPAAS GLACAPGGPE AATGALRLRH PSPAATSPPG APLPHYLGAS 660 
VIITNGR 



SEQ ID NO:111 PFJ7 DNA SEQUENCE 

Nucleic Acid Accession #: NM_006549 

Coding sequence: 1-1254 (underlined sequences correspond to start and stop codons) 

1 11 21 31 41 51 
| | | | | | 

ATGAACGGAC GCTGCATCTG CCCGTCCCTG CCCTACTCAC CCGTCAGCTC CCCGCAGTCC 60 
TCGCCTCGGC TGCCCCGGCG GCCGACAGTG GAGTCTCACC ACGTCTCCAT CACGGGTATG 120 
CAGGACTGTG TGCAGCTGAA TCAGTATACC CTGAAGGATG AAATTGGAAA GGGCTCCTAT 180 
GGTGTCGTCA AGTTGGCCTA CAATGAAAAT GACAATACCT ACTATGCAAT GAAGGTGCTG 240 
TCCAAAAAGA AGCTGATCCG GCAGGCCGGC TTTCCACGTC GCCCTCCACC CCGAGGCACC 300 
CGGCCAGCTC CTGGAGGCTG CATCCAGCCC AGGGGCCCCA TTGAGCAGGT GTACCAGGAA 360 
ATTGCCATCC TCAAGAAGCT GGACCACCCC AATGTGGTGA AGCTGGTGGA GGTCCTGGAT 420 
GACCCCAATG AGGACCATCT GTACATGGTG TTCGAACTGG TCAACCAAGG GCCCGTGATG 480 
GAAGTGCCCA CCCTCAAACC ACTCTCTGAA GACCAGGCCC GTTTCTACTT CCAGGATCTG 540 
ATCAAAGGCA TCGAGTACTT ACACTACCAG AAGATCATCC ACCGTGACAT CAAACCTTCC 600 
AACCTCCTGG TCGGAGAAGA TGGGCACATC AAGATCGCTG ACTTTGGTGT GAGCAATGAA 660 
TTCAAGGGCA GTGACGCGCT CCTCTCCAAC ACCGTGGGCA CGCCOGCCTT CATGGCACCC 720 
GAGTCGCTCT CTGAGACCCG CAAGATCTTC TCTGGGAAGG CCTTGGATGT TTGGGCCATG 780 
GGTGTGACAC TATACTGCTT TGTCTTTGGC CAGTGCCCAT TCATGGACGA GCGGATCATG 840 
TGTTTACACA GTAAGATCAA GAGTCAGGCC CTGGAATTTC CAGACCAGCC CGACATAGCT 900 
GAGGACTTGA AGGACCTGAT CACCCGTATG CTGGACAAGA ACCCCGAGTC GAGGATCGTG 960 
GTGCCGGAAA TCAAGCTGCA CCCCTGGGTC ACGAGGCATG GGGCGGAGCC GTTGCCGTCG 1020 
GAGGATGAGA ACTGCACGCT GGTCGAAGTG ACTGAAGAGG AGGTCGAGAA CTCAGTCAAA 1080 
CACATTCCCA GCTTGGCAAC CGTGATCCTG GTGAAGACCA TGATACGTAA ACGCTCCTTT 1140 
GGGAACCCAT TCGAGGGCAG CCGGCGGGAG GAACGCTCAC TGTCAGCGCC TGGAAACTTG 1200 
CTCACCAAAA AACCAACCAG GGAATGTGAG TCCCTGTCTG AGCTCAAGAC CTAGAAAATA 1260 
AGTCCCCTTC CTGCCTGTTG CAAAGTAACG TAAGAGTTCC CTCACCCGAG TGGATGCAGA 1320 
CGTTCTTGCT GTCAGCCACC TTCCTTCATA CACATAGCCA GCCCAGGGTG ACCAGAACGT 1380 
CCCAGGACAG ATGAGGCTTT GTGTCCTTAT GAGAGTGGGA GAACCTGGTG GGCACCCCTG 1440 
GTGCAGGTGC TGTGGTGGGT GGGGACCCCA CTGCCTTTCC CACTGAGCAC ATCATGGCTA 1500 
CCTGACTTGG TGGGAGTTCC ATTCAGTCAC TTCTGTTTCT TAAACATAGC TTTACTGAGG 1560 
TACAATTCAC ATACCATGTA ATTCACCCAC GGGAAGTGTA TGATTCAGTG GTTTCTAATA 1620 
CACACTTCTG CAGCCATTAC CACCGTCAAC TTTACGACAT TTTCATCAGC CCAAGAAGAC 1680 
ACCCTACACT CCTTAGCTGT CCCCATCCAA CTCCCCCACC CCAGTAACCA CTCAGAATAG 1740 
GTATGGATTT GCCTATTCTG GACGTTTCGT ATAAATGGCG TCATACACTA AAAAAAAAAA 1800 
AAAA 



SEQ ID NO:112 PFJ7 Protein sequence: 
Protein Accession #: NP_006540.1 

1 11 21 31 41 51 
| | | | | | 

MNGRCICPSL PYSPVSSPQS SPRLPRRPTV ESHHVSITGM QDCVQLNQYT LKDEIGKGSY 60 
GVVKLAYNEN DNTYYAMKVL SKKKLIRQAG FPRRPPPRGT RPAPGGCIQP RGPIEQVYQE 120 
IAILKKLDHP NVVKLVEVLD DPNEDHLYMV FELVNQGPVM EVPTLKPLSE DQARFYFQDL 180 
IKGIEYLHYQ KIIHRDIKPS NLLVGEDGHI KIADFGVSNE FKGSDALLSN TVGTPAFMAP 240 
ESLSETRKIF SGKALDVWAM GVTLYCFVFG QCPFMDERIM CLHSKIKSQA LEFPDQPDIA 300 
EDLKDLITRM LDKNPESRIV VPEIKLHPWV TRHGAEPLPS EDENCTLVEV TEEEVENSVK 360 
HIPSLATVIL VKTMIRKRSF GNPFEGSRRE ERSLSAPGNL LTKKPTRECE SLSELKT 



SEQ ID NO:113 PFJ6 DNA SEQUENCE 

Nucleic Acid Accession #: NM_021310 

Coding sequence: 1-429 (underlined sequences correspond to start and stop codons) 

1 11 21 31 41 51 
| | | | | | 

ATGAAACCTC TGATATGGAC ATGGTCAGAT GTTGAAGGCC AGAGGCCGGC TCTGCTCATC 60 
TGCACAGCTG CAGCAGGACC CACGCAGGGA GTTAAGGGTT ATGGCAAGCC CTTTGAGCCA 120 
AGAAGTGTGA AAAACATACA CTCTACTCCT GCTTACCCAG ATGCCACAAT GCACAGACAA 180 
CTCCTGGCTC CGGTGGAAGG AAGGATGGCA GAGACATTGA ATCAGAAACT CCATGTTGCC 240 
AATGTGCTGG AAGATGACCC CGGCTACCTA CCTCACGTCT ACAGCGAGGA AGGGGAGTGT 300 
GGAGGGGCCC CATCCCTCAG CTCTCTGGCC AGCTTGGAAC AGGAGTTGCA ACCTGATTTG 360 
CTGGACTCTT TGGGTTCAAA AGCGACTCCG TTTGAGGAAA TATATTCAGA GTCAGGTGTT 420 
CCTTCCTAA 



SEQ ID NO:114 PFJ6 Protein sequence: 
Protein Accession #: NP_068582.1 

1 11 21 31 41 51 
| | | | | | 

MKPLIWTWSD VEGQRPALLI CTAAAGPTQG VKGYGKPFEP RSVKNIHSTP AYPDATMHRQ 60 
LLAPVEGRMA ETLNQKLHVA NVLEDDPGYL PHVYSEEGEC GGAPSLSSLA SLEQELQPDL 120 
LDSLGSKATP FEEIYSESGV PS 



SEQ ID NO:115 PFJ5 DNA SEQUENCE 

Nucleic Acid Accession #: NM_006361 

Coding sequence: 131-985 (underlined sequences correspond to start and stop codons) 

1 11 21 31 41 51 
| | | | | | 

CGAATGCAGG CGACTTGCGA GCTGGGAGCG ATTTAAAACG CTTTGGATTC CCCCGGCCTG 60 
GGTGGGGAGA GCGAGCTGGG TGCCCCCTAG ATTCCCCGCC CCCGCACCTC ATGAGCCGAC 120 
CCTCGGCTCC ATGGAGCCCG GCAATTATGC CACCTTGGAT GGAGCCAAGG ATATCGAAGG 180 
CTTGCTGGGA GCGGGAGGGG GGCGGAATCT GGTCGCCCAC TCCCCTCTGA CCAGCCACCC 240 
AGCGGCGCCT ACGCTGATGC CTGCTGTCAA CTATGCCCCC TTGGATCTGC CAGGCTCGGC 300 
GGAGCCGCCA AAGCAATGCC ACCCATGCCC TGGGGTGCCC CAGGGGACGT CCCCAGCTCC 360 
CGTGCCTTAT GGTTACTTTG GAGGCGGGTA CTACTCCTGC CGAGTGTCCC GGAGCTCGCT 420 
GAAACCCTGT GCCCAGGCAG CCACCCTGGC CGCGTACCCC GCGGAGACTC CCACGGCCGG 480 
GGAAGAGTAC CCCAGTCGCC CCACTGAGTT TGCCTTCTAT CCGGGATATC CGGGAACCTA 540 
CCACGCTATG GCCAGTTACC TGGACGTGTC TGTGGTGCAG ACTCTGGGTG CTCCTGGAGA 600 
ACCGCGACAT GACTCCCTGT TGCCTGTGGA CAGTTACCAG TCTTGGGCTC TCGCTGGTGG 660
CTGGAACAGC CAGATGTGTT GCCAGGGAGA ACAGAACCCA CCAGGTCCCT TTTGGAAGGC 720 
AGCATTTGCA GACTCCAGCG GGCAGCACCC TCCTGACGCC TGCGCCTTTC GTCGCGGCCG 780 
CAAGAAACGC ATTCCGTACA GCAAGGGGCA GTTGCGGGAG CTGGAGCGGG AGTATGCGGC 840 
TAACAAGTTC ATCACCAAGG ACAAGAGGCG CAAGATCTCG GCAGCCACCA GCCTCTCGGA 900 
GCGCCAGATT ACCATCTGGT TTCAGAACCG CCGGGTCAAA GAGAAGAAGG TTCTCGCCAA 960 
GGTGAAGAAC AGCGCTACCC CTTAAGAGAT CTCCTTGCCT GGGTGGGAGG AGCGAAAGTG 1020
GGGGTGTCCT GGGGAGACCA GAAACCTGCC AAGCCCAGGC TGGGGCCAAG GACTCTGCTG 1080
AGAGGCCCCT AGAGACAACA CCCTTCCCAG GCCACTGGCT GCTGGACTGT TCCTCAGGAG 1140
CGGCCTGGGT ACCCAGTATG TGCAGGGAGA CGGAACCCCA TGTGACAGGC CCACTCCACC 1200 
AGGGTTCCCA AAGAACCTGG CCCAGTCATA ATCATTCATC CTCACAGTGG CAATAATCAC 1260 
GATAACCAGT 



SEQ ID NO:116 PFJ5 Protein sequence:
Protein Accession #: NP_006352.1

1 11 21 31 41 51
| | | | | |

MEPGNYATLD GAKDIEGLLG AGGGRNLVAH SPLTSHPAAP TLMPAVNYAP LDLPGSAEPP 60 
KQCHPCPGVP QGTSPAPVPY GYFGGGYYSC RVSRSSLKPC AQAATLAAYP AETPTAGEEY 120 
PSRPTEFAFY PGYPGTYHAM ASYLDVSVVQ TLGAPGEPRH DSLLPVDSYQ SWALAGGWNS 180
QMCCQGEQNP PGPFWKAAFA DSSGQHPPDA CAFRRGRKKR IPYSKGQLRE LEREYAANKF 240 
ITKDKRRKIS AATSLSERQI TIWFQNRRVK EKKVLAKVKN SATP



SEQ ID NO:117 PFJ4 DNA SEQUENCE

Nucleic Acid Accession #: NM_005628

Coding sequence: 591-2216 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51
| | | | | |

GTAACCGCTA CTCCCGGACA CCAGACCACC GCCTTCCGTA CACAGGGGCC CGCATCCCAC 60 
CCTCCCGGAC CTAAGAGCCT GGGTCCCCTG TTTCCGG AGG TCCGCTTCCC GGCCCCCAGA 120 
TTCTGGCATC CCAGCCCTCA GTGTCCAAGA CCCAGGCAGC CCGGGTCCCC GCCTCCCGG A 1 80 
TCCAGGCGTC CGGGATCTGC GCCACCAGAA CCTAGCCTCC TGCAGACCTC CGCCATCTGG 240 
GGGCACTCAA CCTCCTGGAG CCAAGGGCCC CACGTCCCAC CCAGAGAAAC TCTCGTATTC 300 
CCAGCTCCTA GGGCCAAGGA ACCCGGGCGC TCCGAACTCC CAGCTTTCGG ACATCTGGCA 360 
C ACGGGGC AG AGCAGAGAAG CTCAGCGCCC AGCCTGGGGA ATTTAAACAC TCCAGCTTCC 420 
AAGAGCCAAG G AACTTCAGT GCTGTG AACT CACAACTCTA AGG AGCCCTC CA AAGTTCCA 480 
GTCTCCAGGT GCTGTTACTC AACTCAGTCC TAGG AACGTC GGGTCCTGGG AAGG AGCCCA 540 
AGCGCTCCCA GCCAGCTTCC AGGCGCTAAG AAACCCCGGT GCTTCCCATC ATGGTGGCCG 600
ATCCTCCTCG AGACTCCAAG GGGCTCGCAG CGGCGGAGCC CACCGCCAAC GGGGGCCTGG 660 
CGCTGGCCTC CATCGAGGAC CAAGGCGCGG CAGCAGGCGG CTACTGCGGT TCCCGGGACC 720 
AGGTGCGCCG CTGCCTTCG A GCCAACCTGC TTGTGCTGCT GACAGTGGTG GCCGTGGTGG 780 
CCGGCGTGGC GCTGGGACTG GGGGTGTCGG GGGCCGGGGG TGCGCTGGCG TTGGGCCCGG 840 
AGCGCTTGAG CGCCTTCGTC TTCCCGGGCG AGCTGCTGCT GCGTCTGCTG CGGATGATCA 900
TCTTGCCGCT GGTGGTGTGC AGCTTGATCG GCGGCGCCGC CAGCCTGGAC CCCGGCGCGC 960 
TCGGCCGTCT GGGCGCCTGG GCGCTGCTCT TTTTCCTGGT C ACCACGCTG CTGGCGTCGG 1020 
CGCTCGG AGT GGGCTTGGCG CTGGCTCTGC AGCCGGGCGC CGCCTCCGCC GCCATCAACG 1080 
CCTCCGTGGG AGCCGCGGGC AGTGCCGAAA ATGCCCCCAG CAAGGAGGTG CTCGATTCGT 1140
TCCTGGATCT TGCGAGAAAT ATCTTCCCTT CCAACCTGGT GTCAGCAGCC TTTCGCTCAT 1200 
ACTCTACCAC CTATGAAGAG AGGAATATCA CCGGAACCAG GGTGAAGGTG CCCGTGGGGC 1260 
AGGAGGTGGA GGGG ATGAAC ATCCTGGGCT TCGTAGTGTT TGCCATCGTC TTTGGTGTGG 1 320 
CGCTGCGG AA GCTGGGGCCT G AAGGGGAGC TGCTTATCCG CTTCTTCAAC TCCTTCAATG 1 380 
AGGCCACCAT GGTTCTGGTC TCCTGGATCA TGTGGTACGC CCCTGTGGGC ATCATGTTCC 1440
TGGTGGCTGG CAAGATCGTG GAGATGGAGG ATGTGGGTTT ACTCTTTGCC CGCCTTGGCA 1500 
AGTAC ATTCT GTGCTGCCTG CTGGGTC ACG CCATCCATGG GCTCCTGGTA CTGCCCCTCA 1 560 
TCTACTTCCT CTTCACCCGC AAAAACCCCT ACCGCTTCCT GTGGGGCATC GTGACGCCGC 1620 
TGGCCACTGC CTTTGGGACC TCTTCCAGTT CCGCCACGCT GCCGCTGATG ATGAAGTGCG 1680 
TGGAGGAGAA TAATGGCGTG GCCAAGCACA TCAGCCGTTT CATCCTGCCC ATCGGCGCCA 1740
CCGTCAACAT GGACGGTGCC GCGCTCTTCC AGTGCGTGGC CGCAGTGTTC ATTGCACAGC 1800 
TCAGCCAGCA GTCCTTGGAC TTCGTAAAGA TCATCACCAT CCTGGTCACG GCCACAGCGT 1860 
CCAGCGTGGG GGCAGCGGGC ATCCCTGCTG GAGGTGTCCT CACTCTGGCC ATCATCCTCG 1920 
AAGCAGTCAA CCTCCCGGTC GACCATATCT CCTTGATCCT GGCTGTGGAC TGGCTAGTCG 1980 
ACCGGTCCTG TACCGTCCTC AATGTAGAAG GTGACGCTCT GGGGGCAGGA CTCCTCCAAA 2040
ATTATGTGGA CCGTACGGAG TCGAGAAGCA CAGAGCCTGA GTTGATACAA GTGAAGAGTG 2100 
AGCTGCCCCT GGATCCGCTG CCAGTCCCCA CTGAGGAAGG AAACCCCCTC CTCAAACACT 2160 
ATCGGGGGCC CGCAGGGGAT GCCACGGTCG CCTCTGAGAA GGAATCAGTC ATGTAAACCC 2220
CGGGAGGG AC CTTCCCTGCC CTGCTGGGGG TGCTCTTTGG ACACTGGATT ATGAGGAATG 2280 
GATAAATGGA TGAGCTAGGG CTCTGGGGGT CTGCCTGCAC ACTCTGGGGA GCCAGGGGCC 2340
CCAGCACCCT CCAGGACAGG AGATCTGGGA TGCCTGGCTG CTGG AGTACA TGTGTTCACA 2400 
AGGGTTACTC CTCAAAACCC CCAGTTCTCA CTCATGTCCC CAACTCAAGG CTAGAAAACA 2460 
GCAAGATGGA GAAATAATGT TCTGCTGCGT CCCCACCGTG ACCTGCCTGG CCTCCCCTGT 2520 
CTCAGGG AGC AGGTCACAGG TCACCATGGG GAATTCTAGC CCCCACTGGG GGGATGTTAC 2580 
AACACCATGC TGGTTATTTT GGCGGCTGTA GTTGTGGGGG GATGTGTGTG TGCACGTGTG 2640
TGTGTGTGTG TGTGTGTGTG TGTGTGTGTG TTCTGTGACC TCCTGTCCCC ATGGTACGTC 2700 
CCACCCTGTC CCCAG ATCCC CTATTCCCTC C ACAATA AC A GA A ACACTCC CAGGGACTCT 2760 
GGGGAGAGGC TG AGG ACAAA TACCTGCTGT CACTCCAGAG GACATTTTTT TTAGCAATAA 2820 
AATTG AGTGT CAACTATTTA AAAAAAAAAA AAAAAA 


SEQ ID NO:118 PFJ4 Protein sequence:
Protein Accession #: NP_005619.1

1 11 21 31 41 51
| | | | | |

MVADPPRDSK GLAAAEPTAN GGLALASIED QGAAAGGYCG SRDQVRRCLR ANLLVLLTVV 60
AVVAGVALGL GVSGAGGALA LGPERLSAFV FPGELLLRLL RMIILPLVVC SLIGGAASLD 120
PGALGRLGAW ALLFFLVTTL LASALGVGLA LALQPGAASA AINASVGAAG SAENAPSKEV 180
LDSFLDLARN IFPSNLVSAA FRSYSTTYEE RNITGTRVKV PVGQEVEGMN ILGLVVFAIV 240
FGVALRKLGP EGELLIRFFN SFNEATMVLV SWIMWYAPVG IMFLVAGKIV EMEDVGLLFA 300
RLGKYILCCL LGHAIHGLLV LPLIYFLFTR KNPYRFLWGI VTPLATAFGT SSSSATLPLM 360
MKCVEENNGV AKHISRFILP IGATVNMDGA ALFQCVAAVF IAQLSQQSLD FVKIITILVT 420
ATASSVGAAG IPAGGVLTLA IILEAVNLPV DHISLILAVD WLVDRSCTVL NVEGDALGAG 480
LLQNYVDRTE SRSTEPELIQ VKSELPLDPL PVPTEEGNPL LKHYRGPAGD ATVASEKESV 540
M



SEQ ID NO:119 PFJ3 DNA SEQUENCE

Nucleic Acid Accession #: NM_006708

Coding sequence: 88-642 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51
| | | | | |

CTAGTTAAGG CGGCACAGGG CCGAGGCGTA GTGTGGGTGA CTCCTCCGTT CCTTGGGTCC 60 
CGTCGTCTGT GATACTGCAG TTCAGCCATG GCAGAACCGC AGCCCCCGTC CGGCGGCCTC 120
ACGGACGAGG CCGCCCTCAG TTGCTGCTCC GACGCGGACC CCAGTACCAA GGATTTTCTA 180 
TTGCAGCAGA CCATGCTACG AGTG A AGG AT CCTAAG AAGT CACTGG ATTT TTATACTAGA 240 

GTTCTTGGAA TGACGCTAAT CCAAAAATGT GATTTTCCCA TTATGAAGTT TTCACTCTAC 300
TTCTTGGCTT ATGAGGATAA AAATGACATC CCTAAAGAAA AAGATGAAAA AATAGCCTGG 360
GCGCTCTCCA GAAAAGCTAC ACTTGAGCTG ACACACAATT GGGGCACTGA AGATGATGCG 420 
ACCCAG AGTT ACCACAATGG CAATTCAGAC CCTCGAGGAT TCGGTCATAT TGGAATTGCT 480 
GTTCCTGATG TATACAGTGC TTGTAAAAGG TTTGAAGAAC TGGGAGTCAA ATTTGTGAAG 540 

AAACCTGATG ATGGTAAAAT GAAAGGCCTG GCATTTATTC AAGATCCTGA TGGCTACTGG 600
ATTGAAATTT TGAATCCTAA CAAAATGGCA ACCTTAATGT AGTGCTGTGA GAATTCTCCT 660
TTGAGATTTC AGAAGAAAGG AAACAATGTG ATTCAAGATA TTTACATACC AGAAGCATCT 720 
AGGACTGATG GATCACTGTC CCGATTCAAA TTATTCTTCA GTCCATTTCC CCTTCCTATT 780 
TCAGCTGTTC CTTTTCACCT AACTGTTCAG TCATTCTGGT TTTCA AGCAG TGCTTTATCT 840 

CATGTCCTTG AATATAGTTG TGTAACTTTA nTTTTAGGT AATAATTAGA ACAGTTCCCT 900
TCAGAGGCTG CATTTGCCTT CTTCTGCCAC CTAAATATTA CTTCCCTTCA AATCTGCCTT 960 
TGAATCATCA TTTTTAAAAA AAAATTAACA TGTTTTTGTT GTAGTTATCT TCTGGGGTTT 1020 
CAATTCCTCA G AAACAACTT TTTTCACAAC GG AAAGG AAA GAACACTAGT GTTCTTTCAG 1080 
TAAAGTACAA AGTGTTTATT TTACAAAAGA GTAGGTACTC TTGAGAGCAA TTCAAATCAT 1 140 
GCTGACAAGG ATACTGATAG AAAAAGTGAT TTCTTCTTAT TATAAAGTAC ATTTAAAGTT 1200
CAAGGACTAA CCTTATTTAT TTGGGAAAGG GGAGGAGGAA GGAAATGATA TGGTACCCAG 1260
ACACTGGGCT AGGCTGCAAC TTTATCTCAT TTAATACTCC CAGCTGTCAT GTGAGAAAGA 1320
AAGCAGGCTA GGCATGTGAA ATCACTTTCA TGGATTATTA ATGGATTTAA GAGGGCATCA 1380
ATCAGCTCAA CTCAAGATTT CATAATCATT TTTAGTATTT AGATTGTGCC TCAAAGTTGT 1440
AGTACCTCAC AATACCTCCA CTGGTTTCCT GTTGTAAAAA CCTTCAGTGA GTTTGACCAT 1500 
TGTGCTCTTG GCTCTTGGGC TGG AGTACCG TGGTGAGGG A GTAA ACACTA G A AGTCTTTA 1 560 
GTACAAAACT GCTCTAGGGA CACCTGGTGA TTCCTACACA AGTGATGTTT ATATTTCTCA 1620 
TAAAGAGTCT TCCCTATCCC AAGGTCTTCA TGATGCCAGT AGCCATATAT GATAAATTAT 1680 

GTTCAGTGAT AACTTAGTTA TCAGAAATCA GCTCAGTGGT CTTCCCCGCC ATGATTCACA 1740
TTTGATG AGT TTTTA A A A AT CAAAGTGATT TTGAAAATCT CTAATGGCTC AGAAAATAAA 1 800 
AACATCCAGT TTGTGG ATG A CTATATTTAG ATTTCTCTAG ACTCTAGTGG AAGACCTTTG 1860 
GAAAGGCCAT GCCAACCGTG CTTGTACTGC TAGAAGCACT TTATGTTTCC TTTTTGGGTG 1920 
AAATGGATTT ATGTG AGTGC TTTAAACAAA TAGCAATACT TATAGACTG A AATAAAATGA 1980 

AACTTCAAAT AAG



SEQ ID NO:120 PFJ3 Protein sequence: 
Protein Accession #: NP_006699.1

1 11 21 31 41 51
| | | | | |

MAEPQPPSGG LTDEAALSCC SDADPSTKDF LLQQTMLRVK DPKKSLDFYT RVLGMTLIQK 60
CDFPIMKFSL YFLAYEDKND IPKEKDEKIA WALSRKATLE LTHNWGTEDD ATQSYHNGNS 120
DPRGFGHIGI AVPDVYSACK RFEELGVKFV KKPDDGKMKG LAFIQDPDGY WIEILNPNKM 180
ATLM



SEQ ID NO:121 PFJ2 DNA SEQUENCE

Nucleic Acid Accession #: NM_002867

Coding sequence: 70-729 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51
| | | | | |

CCGACGCCAG GTCCTGCCGT CCCGCCGACC GTCCGGGAGC GAACCCGTCG TCCCGCACTG 60
GAGTCCGCGA TGGCTTCAGT GACAGATGGT AAACATGGAG TCAAAGATGC CTCTGACCAG 120
AATTTTGACT ACATGTTTAA ACTGCTTATC ATTGGCAACA GCAGTGTTGG CAAGACCTCC 180
TTCCTCTTGC GCTATGCTGA TGACACGTTC ACCCCAGCCT TCGTTAGCAC CGTGGGCATC 240
GACTTCAAGG TGAAGACAGT CTACCGTCAC GAGAAGCGGG TGAAACTGCA GATCTGGGAC 300
ACAGCTGGGC AGGAGCGGTA CCGGACCATC ACAACAGCCT ATTACCGTGG GGCCATGGGC 360
TTCATTCTGA TGTATGACAT CACCAATGAA GAGTCCTTCA ATGCTGTCCA AGACTGGGCT 420
ACTCAGATCA AGACCTACTC CTGGGACAAT GCACAAGTTA TTCTGGTGGG GAACAAGTGT 480
GACATGGAGG AAGAGAGGGT TGTTCCCACT GAGAAGGGCC AGCTCCTTGC AGAGCAGCTT 540
GGGTTTGATT TCTTTGAAGC CAGTGCAAAG GAGAACATCA GTGTAAGGCA GGCCTTTGAG 600
CGCCTGGTGG ATGCCATTTG TGACAAGATG TCTGATTCGC TGGACACAGA CCCGTCGATG 660
CTGGGCTCCT CCAAGAACAC GCGTCTCTCG GACACCCCAC CGCTGCTGCA GCAGAACTGC 720
TCATGCTAGC AAGGCCCACC TTCCTGACCT CCCCTCATTG TGGCCCCACA CCCAAGTCTG 780
CTTCTCCCTG TTACACACTG TCCGCTCT






SEQ ID NO:122 PFJ2 Protein sequence:
Protein Accession #: NP_002858.1

1 11 21 31 41 51
| | | | | |

MASVTDGKHG VKDASDQNFD YMFKLLIIGN SSVGKTSFLL RYADDTFTPA FVSTVGIDFK 60
VKTVYRHEKR VKLQIWDTAG QERYRTITTA YYRGAMGFIL MYDITNEESF NAVQDWATQI 120
KTYSWDNAQV ILVGNKCDME EERVVPTEKG QLLAEQLGFD FFEASAKENI SVRQAFERLV 180
DAICDKMSDS LDTDPSMLGS SKNTRLSDTP PLLQQNCSC



SEQ ID NO:123 PFJ1 DNA SEQUENCE

Nucleic Acid Accession #: NM_001844

Coding sequence: 158-4621 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51
| | | | | |

ACGCAGAGCG CTGCTGGGCT GCCGGGTCTC CCGCTTCCTC CTCCTGCTCC AAGGGCCTCC 60 
TGCATGAGGG CGCGGTAGAG ACCCGGACCC GCGCCGTGCT CCTGCCGTTT CGCTGCGCTC 120 
CGCCCGGGCC CGGCTCAGCC AGGCCCCGCG GTGAGCCATG ATTCGCCTCG GGGCTCCCCA 180
GTCGCTGGTG CTGCTGACGC TGCTCGTCGC CGCTGTCCTT CGGTGTCAGG GCCAGGATGT 240
CCAGGAGGCT GGCAGCTGTG TGCAGGATGG GCAGAGGTAT AATGATAAGG ATGTGTGGAA 300
GCCGGAGCCC TGCCGGATCT GTGTCTGTGA CACTGGGACT GTCCTCTGCG ACGACATAAT 360 
CTGTGAAGAC GTGAAAGACT GCCTCAGCCC TGAGATCCCC TTCGGAGAGT GCTGCCCCAT 420 
CTGCCCAACT GACCTCGCCA CTGCCAGTGG GCAACCAGGA CCAAAGGGAC AGAAAGGAGA 480
ACCTGGAGAC ATCAAGGATA TTGTAGGACC CAAAGGACCT CCTGGGCCTC AGGGACCTGC 540
AGGGGAACAA GGACCCAGAG GGGATCGTGG TGACAAAGGT GAAAAAGGTG CCCCTGGACC 600
TCGTGGCAGA GATGGAGAAC CTGGGACCCC TGGAAATCCT GGCCCCCCTG GTCCTCCCGG 660
CCCCCCTGGT CCCCCTGGTC TTGGTGGAAA CTTTGCTGCC CAGATGGCTG GAGGATTTGA 720
TGAAAAGGCT GGTGGCGCCC AGTTGGGAGT AATGCAAGGA CCAATGGGCC CCATGGGACC 780
TCGAGGACCT CCAGGCCCTG CAGGTGCTCC TGGGCCTCAA GGATTTCAAG GCAATCCTGG 840
TGAACCTGGT GAACCTGGTG TCTCTGGTCC CATGGGTCCC CGTGGTCCTC CTGGTCCCCC 900 
TGGAAAGCCT GGTGATGATG GTGAAGCTGG AAAACCTGGA AAAGCTGGTG AAAGGGGTCC 960 
GCCTGGTCCT CAGGGTGCTC GTGGTTTCCC AGGAACCCCA GGCCTTCCTG GTGTCAAAGG 1020 
TCACAGAGGT TATCCAGGCC TGGACGGTGC TAAGGGAGAG GCGGGTGCTC CTGGTGTGAA 1080 

GGGTGAGAGT GGTTCCCCGG GTGAGAACGG ATCTCCGGGC CCAATGGGTC CTCGTGGCCT 1140
GCCTGGTGAA AGAGGACGGA CTGGCCCTGC TGGCGCTGCG GGTGCCCGAG GCAACGATGG 1200 
TCAGCCAGGC CCCGCAGGTC CTCCGGGTCC TGTCGGTCCT GCTGGTGGTC CTGGCTTCCC 1260 
TGGTGCTCCT GGAGCCAAGG GTG AAGCCGG CCCCACTGGT GCCCGTGGTC CTGAAGGTGC 1320 
TCAAGGTCCT CGCGGTGAAC CTGGTACTCC TGGGTCCCCT GGGCCTGCTG GTGCCTCCGG 1380 

TAACCCTGGA ACAGATGGAA TTCCTGGAGC CAAAGGATCT GCTGGTGCTC CTGGCATTGC 1440

TGGTGCTCCT GGCTTCCCTG GGCCACGGGG TCCTCCTGGC CCTCAAGGTG CAACTGGTCC 1500 
TCTGGGCCCG AAAGGTCAGA CGGGTGAACC TGGTATTGCT GGCTTCAAAG GTGAACAAGG 1560 
CCCCAAGGGA GAACCTGGCC CTGCTGGCCC CCAGGGAGCC CCTGGACCCG CTGGTGAAGA 1620 
AGGCAAGAGA GGTGCCCGTG GAGAGCCTGG TGGCGTTGGG CCCATCGGTC CCCCTGGAGA 1680 

AAGAGGTGCT CCCGGAAACC GCGGTTTCCC AGGTCAAGAT GGTCTGGCAG GTCCCAAGGG 1740
AGCCCCTGGA GAGCGAGGGC CCAGTGGTCT TGCTGGCCCC AAGGGAGCCA ACGGTGACCC 1800
TGGCCGTCCT GGAGAACCTG GCCTTCCTGG AGCCCGGGGT CTCACTGGCC GCCCTGGTGA 1860
TGCTGGTCCT CAAGGCAAAG TTGGCCCTTC TGGAGCCCCT GGTGAAGATG GTCGTCCTGG 1920
ACCTCCAGGT CCTCAGGGGG CTCGTGGGCA GCCTGGTGTC ATGGGTTTCC CTGGCCCCAA 1980
AGGTGCCAAC GGTGAGCCTG GCAAAGCTGG TGAGAAGGGA CTGCCTGGTG CTCCTGGTCT 2040
GAGGGGTCTT CCTGGCAAAG ATGGTGAGAC AGGTGCTGCA GGACCCCCTG GCCCTGCTGG 2100 
ACCTGCTGGT G A ACGAGGCG AGCAGGGTGC TCCTGGGCCA TCTGGGTTCC AGGGACTTCC 2160 
TGGCCCTCCT GGTCCCCCAG GTG AAGGTGG AAAACCAGGT GACCAGGGTG TTCCCGGTGA 2220 
AGCTGG AGCC CCTGGCCTCG TGGGTCCCAG GGGTG AACG A GGTTTCCCAG GTGAACGTGG 2280 

CTCTCCCGGT GCCCAGGGCC TCCAGGGTCC CCGTGGCCTC CCCGGCACTC CTGGCACTGA 2340
TGGTCCCAAA GGTGCATCTG GCCCAGC AGG CCCCCCTGGC GC ACAGGGCC CTCCAGGTCT 2400 
TCAGGGAATG CCTGGCGAGA GGGGAGCAGC TGGTATCGCT GGGCCCAAAG GCGACAGGGG 2460 
TGACGTTGGT GAGAAAGGCC CTGAGGGAGC CCCTGGAAAG GATGGTGGAC GAGGCCTGAC 2520 
AGGTCCCATT GGCCCCCCTG GCCCAGCTGG TGCTAACGGC GAGAAGGGAG AAGTTGGACC 2580 

TCCTGGTCCT GCAGGAAGTG CTGGTGCTCG TGGCGCTCCG GGTGAACGTG GAGAGACTGG 2640
CCCCCCCGGA CCAGCGGG AT TTGCTGGGCC TCCTGGTGCT G ATGGCCAGC CTGGGGCCAA 2700 
GGGTGAGCAA GGAGAGGCCG GCCAGAAAGG CGATGCTGGT GCCCCTGGTC CTCAGGGCCC 2760 
CTCTGGAGCA CCTGGGCCTC AGGGTCCTAC TGG AGTGACT GGTCCTAAAG GAGCCCGAGG 2820 
TGCCCAAGGC CCCCCGGGAG CCACTGGATT CCCTGGAGCT GCTGGCCGCG TTGG ACCCCC 2880 

AGGCTCCAAT GGCAACCCTG GACCCCCTGG TCCCCCTGGT CCTTCTGGAA AAGATGGTCC 2940
CAAAGGTGCT CGAGGAGACA GCGGCCCCCC TGGCCGAGCT GGTGAACCCG GCCTCCAAGG 3000 
TCCTGCTGGA CCCCCTGGCG AGAAGGGAGA GCCTGGAGAT G ACGGTCCCT CTGGTGCCGA 3060 
AGGTCCACCA GGTCCCCAGG GTCTGGCTGG TCAGAGAGGC ATCGTCGGTC TGCCTGGGCA 3120 
ACGTGGTG AG AG AGG ATTCC CTGGCTTGCC TGGCCC ATCG GGTG AGCCCG GCAAGCAGGG 3 1 80 

TGCTCCTGGA GCATCTGGAG ACAGAGGTCC TCCTGGCCCC GTGGGTCCTC CTGGCCTGAC 3240
GGGTCCTGCA GGTGAACCCG GACGAGAGGG AAGCCCCGGT GCTGATGGCC CCCCTGGCAG 3300 
AGATGGCGCT GCTGGAGTCA AGGGTGATCG TGGTGAGACT GGTGCTGTGG GAGCTCCTGG 3360 
AGCCCCTGGG CCCCCTGGCT CCCCTGGCCC CGCTGGTCCA ACTGGCAAGC AAGGAGACAG 3420 
AGGAGAAGCT GGTGCACAAG GCCCCATGGG ACCCTC AGG A CCAGCTGGAG CCCGGGG AAT 3480 

CCAGGGTCCT CAAGGCCCCA GAGGTGACAA AGGAGAGGCT GGAGAGCCTG GCGAGAGAGG 3540
CCTGAAGGGA CACCGTGGCT TCACTGGTCT GCAGGGTCTG CCCGGCCCTC CTGGTCCTTC 3600 
TGGAGACCAA GGTGCTTCTG GTCCTGCTGG TCCTTCTGGC CCTAG AGGTC CTCCTGGCCC 3660 
CGTCGGTCCC TCTGGCAAAG ATGGTGCTAA TGGAATCCCT GGCCCCATTG GGCCTCCTGG 3720 
TCCCCGTGG A CGATC AGGCG AAACCGGTCC TGCTGGTCCT CCTGGAAATC CTGGGCCCCC 3780 

TGGTCCTCCA GGTCCCCCTG GCCCTGGCAT CGACATGTCC GCCTTTGCTG GCTTAGGCCC 3840

GAGAGAGAAG GGCCCCGACC CCCTGCAGTA CATGCGGGCC GACCAGGCAG CCGGTGGCCT 3900 
GAGACAGCAT GACGCCGAGG TGGATGCCAC ACTCAAGTCC CTCAACAACC AGATTGAGAG 3960 
CATCCGCAGC CCCGAGGGCT CCCGCAAGAA CCCTGCTCGC ACCTGCAGAG ACCTGAAACT 4020 
CTGCCACCCT GAGTGGAAGA GTGGAGACTA CTGGATTGAC CCCAACCAAG GCTGCACCTT 4080 

GGACGCCATG AAGGTTTTCT GCAACATGGA GACTGGCGAG ACTTGCGTCT ACCCCAATCC 4140

AGCAAACGTT CCCAAGAAGA ACTGGTGGAG CAGCAAG AGC AAGGAGAAGA AACACATCTG 4200 
GTTTGG AG A A ACCATCAATG GTGGCTTCCA TTTCAGCTAT GGAGATGACA ATCTGGCTCC 4260 
CAACACTGCC AACGTCCAGA TGACCTTCCT ACGCCTGCTG TCCACGGAAG GCTCCCAGAA 4320 
CATCACCTAC CACTGCAAGA ACAGC ATTGC CTATCTGGAC GAAGC AGCTG GCAACCTC AA 4380 

GAAGGCCCTG CTCATCCAGG GCTCCAATGA CGTGGAGATC CGGGCAGAGG GCAATAGCAG 4440
GTTCACGTAC ACTGCCCTGA AGGATGGCTG CACGAAACAT ACCGGTAAGT GGGGCAAGAC 4500 
TGTTATCGAG TACCGGTCAC AGAAGACCTC ACGCCTCCCC ATCATTGACA TTGCACCCAT 4560 
GG AC AT AGG A GGGCCCGAGC AGGAATTCGG TGTGGACATA GGGCCGGTCT GCTTCTTGTA 4620 
AAAACCTGAA CCC AG A A AC A ACACAATCCG TTGCAAACCC AAAGGACCCA AGTACTTTCC 4680 

AATCTCAGTC ACTCTAGGAC TCTGCACTGA ATGGCTGACC TGACCTGATG TCCATTCATC 4740
CCACCCTCTC ACAGTTCGGA CTTTTCTCCC CTCTCTTTCT AAGAGACCTG AACTGGGCAG 4800 
ACTGCAAAAT AAAATCTCGG TGTTCTATTT ATTTATTGTC TTCCTGTAAG ACCTTCGGGT 4860 
CAAGGCAGAG GCAGGAAACT AACTGGTGTG AGTCAAATGC CCCCTGAGTG ACTGCCCCCA 4920 
GCCCAGGCC A GAAG ACCTCC CTTCAGGTGC CGGGCGCAGG AACTGTGTGT GTCCTACACA 4980 

ATGGTGCTAT TCTGTGTCAA ACACCTCTGT ATTTTTTAAA ACATCAATTG ATATTAAAAA 5040
TGAAAAGATT ATTGGAAAGT



SEQ ID NO:124 PFJ1 Protein sequence:
Protein Accession #: NP_001835.2

1 11 21 31 41 51
| | | | | |

MIRLGAPQSL VLLTLLVAAV LRCQGQDVQE AGSCVQDGQR YNDKDVWKPE PCRICVCDTG 60 
TVLCDDIICE DVKDCLSPEI PFGECCPICP TDLATASGQP GPKGQKGEPG DIKDIVGPKG 120
PPGPQGPAGE QGPRGDRGDK GEKGAPGPRG RDGEPGTPGN PGPPGPPGPP GPPGLGGNFA 180
AQMAGGFDEK AGGAQLGVMQ GPMGPMGPRG PPGPAGAPGP QGFQGNPGEP GEPGVSGPMG 240 
PRGPPGPPGK PGDDGEAGKP GKAGERGPPG PQGARGFPGT PGLPGVKGHR GYPGLDGAKG 300 
EAGAPGVKGE SGSPGENGSP GPMGPRGLPG ERGRTGPAGA AGARGNDGQP GPAGPPGPVG 360 
PAGGPGFPGA PGAKGEAGPT GARGPEGAQG PRGEPGTPGS PGPAGASGNP GTDGIPGAKG 420
SAGAPGIAGA PGFPGPRGPP GPQGATGPLG PKGQTGEPGI AGFKGEQGPK GEPGPAGPQG 480 
APGPAGEEGK RGARGEPGGV GPIGPPGERG APGNRGFPGQ DGLAGPKGAP GERGPSGLAG 540 
PKGANGDPGR PGEPGLPGAR GLTGRPGDAG PQGKVGPSGA PGEDGRPGPP GPQGARGQPG 600
VMGFPGPKGA NGEPGKAGEK GLPGAPGLRG LPGKDGETGA AGPPGPAGPA GERGEQGAPG 660 
PSGFQGLPGP PGPPGEGGKP GDQGVPGEAG APGLVGPRGE RGFPGERGSP GAQGLQGPRG 720 
LPGTPGTDGP KGASGPAGPP GAQGPPGLQG MPGERGAAGI AGPKGDRGDV GEKGPEGAPG 780
KDGGRGLTGP IGPPGPAGAN GEKGEVGPPG PAGSAGARGA PGERGETGPP GPAGFAGPPG 840 
ADGQPGAKGE QGEAGQKGDA GAPGPQGPSG APGPQGPTGV TGPKGARGAQ GPPGATGFPG 900 
AAGRVGPPGS NGNPGPPGPP GPSGKDGPKG ARGDSGPPGR AGEPGLQGPA GPPGEKGEPG 960 
DDGPSGAEGP PGPQGLAGQR GIVGLPGQRG ERGFPGLPGP SGEPGKQGAP GASGDRGPPG 1020
PVGPPGLTGP AGEPGREGSP GADGPPGRDG AAGVKGDRGE TGAVGAPGAP GPPGSPGPAG 1080
PTGKQGDRGE AGAQGPMGPS GPAGARGIQG PQGPRGDKGE AGEPGERGLK GHRGFTGLQG 1140
LPGPPGPSGD QGASGPAGPS GPRGPPGPVG PSGKDGANGI PGPIGPPGPR GRSGETGPAG 1200 
PPGNPGPPGP PGPPGPGIDM SAFAGLGPRE KGPDPLQYMR ADQAAGGLRQ HDAEVDATLK 1260 
SLNNQIESIR SPEGSRKNPA RTCRDLKLCH PEWKSGDYWI DPNQGCTLDA MKVFCNMETG 1320 
ETCVYPNPAN VPKKNWWSSK SKEKKHIWFG ETINGGFHFS YGDDNLAPNT ANVQMTFLRL 1380 
LSTEGSQNIT YHCKNSIAYL DEAAGNLKKA LLIQGSNDVE IRAEGNSRFT YTALKDGCTK 1440 
HTGKWGKTVI EYRSQKTSRL PIIDIAPMDI GGPEQEFGVD IGPVCFL 



SEQ ID NO:125 PFH9 DNA SEQUENCE 

Nucleic Acid Accession #: NM_005084

Coding sequence: 162-1487 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51
| | | | | |

GCTGGTCGGA GGCTCGCAGT GCTGTCGGCG AGAAGCAGTC GGGTTTGG AG CGCTTGGGTC 60 
GCGTTGGTGC GCGGTGGAAC GCGCCCAGGG ACCCCAGTTC CCGCGAGCAG CTCCGCGCCG 120 
CGCCTGAGAG ACTAAGCTGA AACTGCTGCT CAGCTCCCAA GATGGTGCCA CCCAAATTGC 180
ATGTGCTTTT CTGCCTCTGC GGCTGCCTGG CTGTGGTTTA TCCTTTTGAC TGGCAATACA 240
TAAATCCTGT TGCCCATATG AAATCATCAG CATGGGTCAA CAAAATACAA GTACTGATGG 300 
CTGCTGCAAG CTTTGGCCAA ACTAAAATCC CCCGGGGAAA TGGGCCTTAT TCCGTTGGTT 360 
GTACAG ACTT AATGTTTG AT CACACTAATA AGGGCACCTT CTTGCGTTTA TATTATCCAT 420 
CCCAAGATAA TGATCGCCTT GACACCCTTT GGATCCCAAA TAAAGAATAT TTTTGGGGTC 480 
TTAGCAAATT TCTTGGAACA CACTGGCTTA TGGGCAACAT TTTGAGGTTA CTCTTTGGTT 540 
CAATGACAAC TCCTGCAAAC TGGAATTCCC CTCTGAGGCC TGGTGAAAAA TATCCACTTG 600 
TTGTTTTTTC TCATGGTCTT GGGGCATTCA GGACACTTTA TTCTGCTATT GGCATTGACC 660 
TGGC ATCTCA TGGGTTTATA GTTGCTGCTG TAGAACACAG AGATAGATCT GCATCTGCAA 720 
CTTACTATTT CAAGGACCAA TCTGCTGCAG AAATAGGGGA CAAGTCTTGG CTCTACCTTA 780 
GAACCCTGAA ACAAGAGGAG GAGACACATA TACGAAATGA GCAGGTACGG CAAAGAGCAA 840 
AAGAATGTTC CCAAGCTCTC AGTCTGATTC TTGACATTGA TCATGGAAAG CCAGTGAAGA 900 
ATGCATTAGA TTTAAAGTTT GATATGG AAC AACTGAAGGA CTCTATTGAT AGGGAAAAAA 960 
TAGCAGTAAT TGG AC ATTCT TTTGGTGG AG CAACGGTTAT TCAGACTCTT AGTGAAGATC 1020 
AGAGATTCAG ATGTGGTATT GCCCTGGATG CATGGATGTT TCCACTGGGT GATGAAGTAT 1080 
ATTCCAG AAT TCCTCAGCCC CTCTTTTTTA TCAACTCTGA ATATTTCCAA TATCCTGCTA 1 140 
ATATCATAAA A ATG A A A AAA TGCTACTCAC CTGATAAAGA AAGAAAGATG ATTACAATCA 1200 
GGGGTTCAGT CCACCAGAAT TTTGCTGACT TCACTTTTGC AACTGGCAAA ATAATTGGAC 1260 
ACATGCTCAA ATTAAAGGGA GACATAGATT CAAATGTAGC TATTGATCTT AGCAACAAAG 1320 
CTTCATTAGC ATTCTTACAA AAGCATTTAG GACTTCATAA AGATTTTGAT CAGTGGGACT 1380 
GCTTGATTGA AGG AG ATG AT GAGAATCTTA TTCCAGGGAC CAACATTAAC ACAACCAATC 1440 
AACACATCAT GTTACAGAAC TCTTCAGGAA TAGAGAAATA CAATTAGGAT TAAAATAGGT 1500
TTTTT 



SEQ ID NO:126 PFH9 Protein sequence: 
Protein Accession #: NP_005075.1

1 11 21 31 41 51
| | | | | |

MVPPKLHVLF CLCGCLAVVY PFDWQYINPV AHMKSSAWVN KIQVLMAAAS FGQTKIPRGN 60
GPYSVGCTDL MFDHTNKGTF LRLYYPSQDN DRLDTLWIPN KEYFWGLSKF LGTHWLMGNI 120
LRLLFGSMTT PANWNSPLRP GEKYPLVVFS HGLGAFRTLY SAIGIDLASH GFIVAAVEHR 180
DRSASATYYF KDQSAAEIGD KSWLYLRTLK QEEETHIRNE QVRQRAKECS QALSLILDID 240
HGKPVKNALD LKFDMEQLKD SIDREKIAVI GHSFGGATVI QTLSEDQRFR CGIALDAWMF 300
PLGDEVYSRI PQPLFFINSE YFQYPANIIK MKKCYSPDKE RKMITIRGSV HQNFADFTFA 360
TGKIIGHMLK LKGDIDSNVA IDLSNKASLA FLQKHLGLHK DFDQWDCLIE GDDENLIPGT 420
NINTTNQHIM LQNSSGIEKY N


SEQ ID NO:127 PFH8 DNA SEQUENCE

Nucleic Acid Accession #: NM_015900

Coding sequence: 32-1402 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51

CACGAGCGGC ACGAGGATTT CCAGCTCAGC GATGCCCCCA GGTCCCTGGG AGAGCTGCTT 60
CTGGGTGGGG GGCCTCATTT TGTGGCTCAG CGTTGG A AGT TCAGGGG ATG CACCTCCTAC 1 20 
CCCACAGCCA AAGTGCGCTG ACTTCCAGAG CGCCAACCTT TTTGAAGGCA CCGATCTCAA 180 
AGTCCAGTTT CTCCTCTTTG TCCCTTCGAA TCCTAGCTGT GGGCAGCTAG TAGAAGGAAG 240 
CAGTGACCTC CAAAACTCTG GGTTCAATGC C ACTCTGGG A ACCAAACTAA TTATCCATGG 300 
ATTCAGGGTT TTAGGAACAA AGCCTTCCTG GATTGACACA TTTATTAGAA CCCTTCTGCG 360
TGCAACGAAT GCTAATGTGA TTGCCGTGG A CTGGATTTAT GGGTCTACAG GAGTCTACTT 420 
CTCAGCTGTG AAAAATGTGA TTAAGTTGAG CCTCGAG ATC TCCCTTTTCC TCAATAAACT 480 
CCTGGTGCTG GGTGTGTCGG A ATCCTCAAT CCACATCATT GGTGTTAGCC TGGGGGCCC A 540 
CGTTGGGGGC ATGGTGGGAC AGCTCTTCGG AGGCCAGCTG GGACAGATCA CAGGCCTGGA 600 
CCCCGCTGGA CCTGAGTACA CCAGGGCCAG TGTGGAAGAG CGCTTGGATG CTGGAGATGC 660
CCTCTTCGTG GAAGCCATCC ACACAGACAC CGACAATTTG GGTATTCGGA TTCCCGTTGG 720 
ACATGTGG AC TACTTCGTCA ACGGAGGCCA AGACCAACCT GGCTGCCCCA CCTTCTTTTA 780 
CGCAGGTTAT AGTTATCTGA TCTGTGATCA CATGAGGGCT GTGCACCTCT ACATCAGCGC 840 
CCTGG AG AAT TCCTGTCCAC TG ATGGCCTT TCCCTGTGCC AGCTAC AAGG CCTTCCTTGC 900 
TGGACGCTGT CTGGATTGCT TTAACCCTTT TCTGCTTTCC TGCCCAAGGA TAGGACTGGT 960

GGAACAAGGT GGTGTCAAGA TAGAGCCGCT CCCCAAGGAA GTGAAAGTCT ACCTCCTGAC 1020 
TACTTCCAGT GCTCCGTACT GCATGCATCA CAGCCTCGTG GAGTTTCACT TGAAGGAACT 1080 
GAGAAACAAG GACACCAACA TCGAGGTTAC CTTCCTTAGC AGTAACATCA CCTCTTCATC 1 140 
TAAGATCACC ATACCTAAGC AGCAACGCTA TGGGAAAGG A ATCATAGCCC ATGCCACCCC 1200 
ACAATGCCAG ATAAACCAAG TGAAATTCAA GTTTCAGTCT TCCAACCGAG TTTGGAAAAA 1260
AGACCGGACT ACCATTATTG GGAAGTTCTG CACTGCCCTT TTGCCTGTCA ATGACAGAGA 1320 
AAAGATGGTC TGCTTACCTG AACCAGTGAA CTTACAAGCA AGTGTGACTG TTTCCTGTGA 1380 
CCTGAAGATA GCCTGTGTGT AGTTTAACCT GGGCAGGACA CATCTCCCTG CATTTTTTTT 1440
TTTTTTTTTT GAGAG AG AGG TGTGATGAGG GATGTGTGTG TGCAGCTTAT TGTAGACCAT 1500 
TACTACTAAG GAGAAAAGCA AAGCTCTTTC TTATTTTCCT CATAATCAGC TACCCTGGAG 1560
GGGAGGGAGA ACTCATTTTA CAGAACTTGG TTTCCTTTGC CGATCTTATG TACATACCCA 1620 
TTTTAGCTTT CCCATGCATA CTTAACTGCA CTTGCTTTAT CTCCTTGGGC ATTCGTACTT 1680 
AGGATTCAAT AGAAACATGT ACAGGGTAAA CAATTTTTTA AAAATAAAAC TTCATGGAGT 1740 
AAAAAAAAAA AAAAAAAA 






SEQ ID NO:128 PFH8 Protein sequence:
Protein Accession #: NP_056984.1

1 11 21 31 41 51
| | | | | |

MPPGPWESCF WVGGLILWLS VGSSGDAPPT PQPKCADFQS ANLFEGTDLK VQFLLFVPSN 60
PSCGQLVEGS SDLQNSGFNA TLGTKLIIHG FRVLGTKPSW IDTFIRTLLR ATNANVIAVD 120
WIYGSTGVYF SAVKNVIKLS LEISLFLNKL LVLGVSESSI HIIGVSLGAH VGGMVGQLFG 180
GQLGQITGLD PAGPEYTRAS VEERLDAGDA LFVEAIHTDT DNLGIRIPVG HVDYFVNGGQ 240
DQPGCPTFFY AGYSYLICDH MRAVHLYISA LENSCPLMAF PCASYKAFLA GRCLDCFNPF 300
LLSCPRIGLV EQGGVKIEPL PKEVKVYLLT TSSAPYCMHH SLVEFHLKEL RNKDTNIEVT 360
FLSSNITSSS KITIPKQQRY GKGIIAHATP QCQINQVKFK FQSSNRVWKK DRTTIIGKFC 420
TALLPVNDRE KMVCLPEPVN LQASVTVSCD LKIACV

SEQ ID NO:129 PFH7 DNA SEQUENCE

Nucleic Acid Accession #: NM_014334

Coding sequence: 89-1336 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51
| | | | | |

CGTTGCCGGG TCGCAGGTCC CGCCAGTGCG AGCGCAACGG AGGTCGAAGG CGTTCAGACT 60
CTTAGCTGAA CGCGGAGCTG CGGCGGCTAT GCTGTGGAGC GGCTGCCGGC GTTTCGGGGC 120
GCGCCTCGGC TGCCTGCCCG GCGGTCTCCG GGTCCTCGTC CAGACCGGCC ACCGGAGCTT 180 
GACCTCCTGC ATCG ACCCTT CCATGGG ACT TA ATGAAGAG CAGAAAGAAT TTCAAAAAGT 240 
GGCCTTTGAC TTTGCTGCCC G AG AG ATGGC TCCAAATATG GCAGAGTGGG ACCAGAAGGA 300 

GCTGTTCCCA GTGGATGTGA TGCGGAAGGC AGCCCAGCTA GGCTTCGGAG GGGTCTACAT 360
ACAAACAG AT GTGGGCGGGT CTGGGCTGTC ACGTCTTGAT ACCTCTGTCA TTTTTGAAGC 420 
CTTGGCTACA GGCTGCACCA GCACCACAGC CTATATAAGC ATCCACAACA TGTGTGCCTG 480 
GATGATTGAT AGCTTCGGA A ATGAGGAACA G AGGCACAAA TTTTGCCC AC CGCTCTGTAC 540 
CATGG AG AAG TTTGCTTCCT ACTGCCTCAC TG AACCAGG A AGTGGGAGTG ATGCTGCCTC 600 

TCTTCTGACC TCCGCTAAGA AACAGGGAGA TCATTACATC CTCAATGGCT CCAAGGCCTT 660
CATCAGTGGT GCTGGTG AGT CAGACATCTA TGTGGTCATG TGCCGAACAG G AGG ACC AGG 720 
CCCCAAGGGC ATCTCATGCA TAGTTGTTGA GAAGGGGACC CCTGGCCTCA GCTTTGGCAA 780 
GAAGGAGAAA AAGGTGGGGT GGAACTCCCA GCCAACACGA GCTGTG ATCT TCGAAGACTG 840 
TGCTGTCCCT GTGGCCAACA G AATTGGGAG CGAGGGGCAG GGCTTCCTCA TTGCCGTGAG 900 
AGGACTGAAC GGAGGGAGGA TCAATATTGC TTCCTGCTCC CTGGGGGCTG CCCACGCCTC 960
TGTCATCCTC ACCCG AGACC ACCTCAATGT CCGGAAGCAG TTTGGAGAGC CTCTGGCCAG 1020 
TAACCAGTAC TTGCAATTCA CACTGGCTGA TATGGCAACA AGGCTGGTGG CCGCGCGGCT 1080 
GATGGTCCGC AATGCAGCAG TGGCTCTGCA GGAGGAGAGG AAGGATGCAG TGGCCTTGTG 1140
CTCCATGGCC AAGCTCTTTG CTACAGATGA ATGCTTTGCC ATCTGCAACC AGGCCTTGCA 1200
GATGCACGGG GGCTACGGCT ACCTGAAGGA TTACGCTGTT CAGCAGTACG TGCGGGACTC 1260 
CAGGGTCCAC CAGATTCTAG AAGGTAGCAA TGAAGTGATG AGGATACTGA TCTCTAGAAG 1320 
CCTGCTTCAG GAGTAGAACC CACACTTGTT CTGGCCTGGT GTTCAGTGCG ACTGCAGTCA 1380
GTGTTGAGTG GTGCCATGTG GGCCGCTCTA TTCCAAAGGA ATCATGGATT AGACCCAAGG 1440 

GCTGAGCTCC TCTAGGGCAG GACCTGCACC CTGTGTGTTG GCACCAGCAT CGGGTCTTGG 1500

ACTGGGGCAG AATCCCCAGT GGAACCGGAA GAGCTGGACT GATGAGAAAC ATCAGAAGAA 1560 
CACATACTAC CTTGTTTTCC TAATGCCAGA AGGGTGACCA GTGAAGATTC ACCGTCAAAC 1620 
CATGAAAGTC CTTTCTTGGA TCCACTTTAT CTTGATTAGT CTGCATTTTA CTAGTTCACT 1680 
GG ATCCCTCC TCTAGGGGCC TGGGG ACTTT CACTGATGCT CTTCCTGATT CTAGAGCAAA 1 740 

GGTGTGGGAA GGGGAAATGG AGGAATGCCC TCCTGTCTGT GTCGTTCTCT GTGCCACAGC 1800
TACAGATGCA GAAGGTTTCT CTGGATAGCA CACCTCTGAA TGTAAATCAT GATAAAATGG 1860
ATATTTGGAA ACTTACTCCT AAGCTGTG AT GTAGGGTGTA TTTCTACTTC TGGACTGCCT 1920 
CAATATCAAG GGCTGAGACT TTTGAATGTT G AATATTCGT TGGGTTTCAT GTTAAGACGC 1980 
CTGTGGTCC A GGAGTGCTAT TCAGTGTTTC TGTTCCTGAT AAACACTTTG AATATTTTTT 2040 

TGTGTTTTTG TTTCCTTTTC TGAAGCTGTT CCTCCTTTTA AATATTTTTA ATCACATTGA 2100
TAAAATCTAT CCTTCATCCA CCTCTGGTTC TACTATAGTT GATTTTTATT TTAAATGTTT 2160 
AATTGTATTT GATTAAACAC TTAACTGG AT TTTGGAATAA TAAAACTCTC GTCCAATTTG 2220 
GCTTTTAAAA AAAAAAAA 

SEQ ID NO:130 PFH7 Protein sequence:
Protein Accession #: NP_055199.1

1 11 21 31 41 51
| | | | | |

MLWSGCRRFG ARLGCLPGGL RVLVQTGHRS LTSCIDPSMG LNEEQKEFQK VAFDFAAREM 60
APNMAEWDQK ELFPVDVMRK AAQLGFGGVY IQTDVGGSGL SRLDTSVIFE ALATGCTSTT 120
AYISIHNMCA WMIDSFGNEE QRHKFCPPLC TMEKFASYCL TEPGSGSDAA SLLTSAKKQG 180
DHYILNGSKA FISGAGESDI YVVMCRTGGP GPKGISCIVV EKGTPGLSFG KKEKKVGWNS 240
QPTRAVIFED CAVPVANRIG SEGQGFLIAV RGLNGGRINI ASCSLGAAHA SVILTRDHLN 300
VRKQFGEPLA SNQYLQFTLA DMATRLVAAR LMVRNAAVAL QEERKDAVAL CSMAKLFATD 360
ECFAICNQAL QMHGGYGYLK DYAVQQYVRD SRVHQILEGS NEVMRILISR SLLQE






SEQ ID NO:131 PFH6 DNA SEQUENCE

Nucleic Acid Accession #: NM_013989

Coding sequence: 707-1105 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51
| | | | | |

GCCTGCAGAG AGAGGCACTT TGCACCACAG ACAGATAGCA AGAAGGGAAA GACAGAGAGT 60 
GAGAAAAAAG AGGAGTCAGT CGCTCCTGGG GAAGGGAGAG AGTG AGACTG GGAGAAAGAG 120 
AAGCACAGAA AGTGTGTGTA AAACGGAGTA AAGAAAGAAA AAAAAAAAAC TACCCTTAAA 180
GCACATTTAA AAAAAAAAAA CTCTGGCAAT TCAAGAAAGA AACAGGCTAC GTTTAAAGAG 240 
CATAG AG ACA ATG AAAGGCT AA AG AAAATT TTAAAATCTC TGCCAC AGTC TCATAGGTGC 300 
TTGGAAATGA AAGTAGAACT GCCTGTCTTT AACGGACTCT GACAGAGGTA ACTGGATTAG 360 
GGACGAGTAC GCC AGCTTTT TTTTTTTTTT TTTTTTTTTT TTTAACATCT TAAATCCTGA 420 
AAAAAAAAAA AAAAAAAAAA AAAAGGCAGC AGCTCCGAAT TGAATGAATT GATGGGCACA 480 
CTCCAACTGC TGGGCTGGAG AGACTGGACT TAGTCTTGCC ATTTCTGCTT CTTTG AAAG A 540 
GGAGACAACT TGGGCTTCCT TTTAATTTAG TTTTTTTTCC CCTTCTCCCC CAACCCCCAA 600 
CCTTCCCCCT TACCTCCCCC ACCCCCTTTA TCACCACCCC CCTTTTAA AT AAGAGGGTGA 660 
AGGGGAACCA GAGCGCACAA GGGAACTGAC TCAGGAGGCA GAGAAGATGG GCATCCTCAG 720
CGTAGACTTG CTGATCACAC TGCAAATTCT GCCAGTTTTT TTCTCCAACT GCCTCTTCCT 780
GGCTCTCTAT G ACTCGGTCA TTCTGCTCAA GCACGTGGTG CTGCTGTTGA GCCGCTCCA A 840 
GTCC ACTCGC GG AG AGTGGC GGCGC ATGCT GACCTCAGAG GGACTGCGCT GCGTCTGGAA 900 
GAGCTTCCTC CTCGATGCCT AC A A AC AGGT G A A ATTGGGT GAGGATGCCC CCAATTCCAG 960 
TGTGGTGCAT GTCTCCAGTA CAGAAGGAGG TGACAACAGT GGCAATGGTA CCCAGGAGAA 1020 
GATAGCTGAG GGAGCCACAT GCCACCTTCT TGACTTTGCC AGCCCTGAGC GCCCACTAGT 1080 
GGTCAACTTT GGCTCAGCCA CTTGACCTCC TTTCACGAGC CAGCTGCCAG CCTTCCGCAA 1140
ACTGGTGGAA GAGTTCTCCT CAGTGGCTGA CTTCCTGCTG GTCTACATTG ATGAGGCTCA 1200 
TCCATCAGAT GGCTGGGCGA TACCGGGGGA CTCCTCTTTG TCTTTTGAGG TGAAGAAGCA 1260 
CCAGAACCAG GAAGATCG AT GTGCAGCAGC CCAGCAGCTT CTGGAGCGTT TCTCCTTGCC 1320 
GCCCCAGTGC CGAGTTGTGG CTGACCGCAT GGACAATAAC GCCAACATAG CTTACGGGGT 1380
AGCCTTTGAA CGTGTGTGCA TTGTGCAGAG ACAGAAAATT GCTTATCTGG GAGGAAAGGG 1440 
CCCCTTCTCC TACAACCTTC AAGAAGTCCG GCATTGGCTG GAGAAGAATT TCAGCAAGAG 1500 
ATGAAAGAAA ACTAGATTAG CTGGTTAAAG GTATGATTAT AAGAGAGCTT ATTGTTTTAA 1560 
AAAGTTATAT A A AGGC A AGG AAATTAAGAA CTG AATCCAT ATTTCA ACAG AGCCCTATTG 1620 
GCTTACTGAA AGACAGGAGT TTATCTATCG GAAGAACATG AATCTCTAAC AGCTCCATAC 1680
TTCTTTCACT ACTCAAATGG CATTGGGCTG AGTAAGTAAC CATATCACCT CTCTTCTTAG 1740 
TAAAAAGCCC TATGTGAAAA GATCCCAAGA TGG AG AGG AA GAAACGCTAA TTCAGCATGT 1800 
GTTCATTCTG CATTGAGAAG G A ACTG AT AC ATCTGATGCA TGCTTTGAGA CCAGAAGAAA 1860 
AGACTTACCT GAATAATTAC TACATTAGGG AAGCTACTGT CTACGTTAAG ATAAAGGGTA 1920 
TTGCCTTGGC TCTATTTGGC ATGGATGGAG CCCAGTTGGA AAATTCCCAA ATATTACAAC 1980
AAGTCCTTGA ACCCAGGCCA TGTGGTTAGA CGTTGGTGTT AAGGTTAGAC CTTATGTTAG 2040 
AGTCATTTCT G ATGTTCC AG CTTCTAGCCA TGTAGTGCTC TCAGTCTTCA TACCCCAGAA 2100 
ATTATTGGTA TATTTGTAGA TACCGAGAAT GATCCCTCAG TCTGAGAGGT TAGAATGATC 2160 
ATCTGTAATC TG AGGGTTAA TTTCTAGGCA GGTGGAGAGA GTGGTAAAAA AGAAATGAAA 2220 
TTGACAAGCT AGGAAAGAGG AGGCAGAAAG ATTTGGAAAA TTCACAGAGT TTCACCCTTA 2280 
AGCTGTAGAG AGTGGGTCAC ATTTGTTAGC CACGGAAACA TAGAAACATA CACAAGGCCA 2340 
GAAAAAGAAG AAGGAGCTCA ACTAAAAGTG GCATAGAGAA TACACATATA AAAACAATAT 2400 
ATTTGTCATA TGCTCCTAGA GAGGAGAAAG GGGTGATTGA AAGAAAAAAA AATACTTAAA 2460 
TATTTGTAAT TGTGAGGGGT TTCTTTTGGA AATAATTACT TTTGAACCAT GTATGTGGTA 2520 
TGTATATTTT CAGTGGGTTA ATTATACCCC ATGATACCTA TTAAAGGAAA ACCAGTGGGT 2580 
CTGGTGGTGC TGGTCTTTTC CTCCCCATTC CTACAATTTC TATGTGGCCC AAGTCATTCC 2640 
TAATCTTGGT CTCTATAGCA GTGTTCTCTC TGAATGCTGA GCTGAAGAAA TTATACGTAC 2700 
ATACACACAT ACATACATAC ATACAAATAT ATGTATATAT ATTCTCAGCT GCTGCGGGAG 2760
GTAGGTACCA TGGCCATTCA GCACAGCCTT GATTTCCTCC CAAAGTAGGT GAGCTATAGT 2820 
GAAGAATAGG TGCAAACAAA CAAGCTTACT TCCATTGCAA AATAGAAGAA GAGGAAGTTA 2880 
GAGATAATTC TGATCAATCA TTTTGGAGGC TTTGTTATAA GGCAACCCCC GGTATATCAT 2940 
GGAATTTCCA TTGACATTTG AATTTGGACT TGGATCTTCC CTTGGTCCCA TTAGCTGAGG 3000 
TTTAGTAATC TAAAGTCCCT ATAGTATATG ATTATAATGC TATTTTAAAA AATATATATA 3060 
TAAAATATTT TTTTCTTTTT AAAATAGACA CTATAGTTTT ACCCATAAGT AATATTTAAA 3120 
GATTATAGCT CCCAAAAGAA TGGACCAACC ACTTTCGTAT CATAATTTCT TTTTGGTAAA 3180
TATGAGACTA TTATGAAATC ATAGTATATG ATTGTATTTA AAGGTACAAT CAAAGGATCT 3240
TTTGTCCATT CCATTAATAA CTGAATAAAA AATAAATAAA ATGGATAGAA AAAAACTAAA 3300
GTTGAAAATA CATTCTTAAA CTAGTTGTCT GAAATGAGAA AAGAGTGAGA ACTAGGTGTG 3360
CAAGAACCAA ACGTATTTTA TTTTATTTTT TAAATGGGAG CAACATATCA GTCGTGTCAC 3420
CAGCTGGTAT ATTGTGTAAA TATTAAAGCT CCATTGGGAC TGATTTTTCA TGGCAACATC 3480
AGCTTTCTAA TGTTCTAAAT TCTATAAAAA CCACCCACAA AGAAACAAAG CAAATTTCAT 3540 
TATCTAATGA GTTGCTGGAA AATCATATTG AGAATAATTA TTTCAGATTC CTCAGTTGTT 3600 
AACTTCTACA TTCAAGGGCT TATCTCTGCC CCCATTGATT TTTAACCTCA AAATGGTGTG 3660
AGATTTACTG TGGAACCCTA AAGCAGTAAA ATAAAAAACC TGGTTGCAGC ACATTCACAC 3720 
TGTTGTCCTT AAAATTCCCC TTTTTTCTCT ATGTACGATA AAGTAACAGT ATGTCAGATA 3780 
AGCCGGTGGG GGGATGAGAT TAGGCTGAGG CAGTGCTAGT CAACTGGGGG AAAAGGATGA 3840
TGGAAAAATC ACCCAGTTGT GCTATATTTT TAAAGAAGGA GGTCGTTTAT GTGTGCAGAC 3900 
AATTCTCCCT GAGGTTAGCC CAATGGAGAA ATGAAGCAGA GGAAGGAAAC ATAGAAAGAC 3960 
ATGGGCTATC AGGGAGGAAG ATGTTCAATA GAACATGCAA GAATTTCTGG AAGAAAGGCT 4020 
GTGGAAGGGC CAATGGAGAA AATGAATGGA CAAAGCTCAG GAATCCCTAC GCTATGTAGA 4080 
ATGTTCTTGG TGTTATCAGG GTTAAGCCCT GTAATTATGT AACCTATTTA TCGCAACATG 4140 
AATTTTTATG ATTTCTTGTG ATGTATTCTT TTATGAAATT AACAAGAACT CATTATTTTG 4200
AGGTAGAGGA AAATCAATGC TTTATCTGAT ATGCTGAGAA ATTATTAGAT TGCCAATACT 4260
CATGTGCGTT TCATGTGTTT TATAAGGTTT GTTCCTTTGA AGAATTGTAG TTCTTAGTCC 4320
CACAGGGAAA TGTGTATCTA TTTATATATC ATAGTATAAA TCTATGATAT ATTTATATCA 4380
TATATAAAAG TCTGAGTTCT CTTTCTTAGT CCCTAATCAT GTTTCTCCCA TAGGCTGTGT 4440
TTACATGGAG CTATCGGTTT AGCCTTTTAA GCTTCATTAG CTTGTCTATT ATTGAAATAG 4500 
TTTCCAAGAA ATTTTAGATA TTATCATAAC ATCTGGGTCT ACTCAAACAC TTATTGTTTG 4560 
AAAGACTTAT GTCTTGGACC TATCAAAAAC TGACTTTATT TATTGCTTAG TGAAAATACT 4620 
AGTGGGATCA ACAATGATTT TCTTGAATGG GCATGAATGG AGATGCCCGC ACAGTAATGT 4680
AGAAATGTTT CATACAGCTA TTAAAATGTA ACTGACCTCC TTAGAGGCAG ATTAGTAACT 4740
GTTCCTACTT TGTATAGCTA AGTGACAGTC ACTTAACTTA CATGACTTTC TTTTTTCACA 4800
TTGGGTCTCT GGTCCTGTGT CTTCACCTCA TTTATAGCAC GTCTCCTTGA TTTTTGGTAG 4860
TATCAACTTC CCAGTGATCT GTTCAGTTAA GTTCTTCTCC CGTTAACCAG GAAGTGCTTA 4920
TTCTCTCATC ACAGTGGGAA GAATAGCCTA TTGTCTTTCA TTTTGCCTGA GTGTATTTTA 4980
CTATTTGGGC TCTGAAATAA AAATTATGAA ATATGGTGAG GTCACATGTT GGTGCTGCCT 5040 
TGCTGCATAA AATTCTAGGA GGGCAGGTTA GGAGACAGTT ATGTATGGCC TTTCGGGAAA 5100 
ATTCAAAGGG TGGGATTACA AGGGTGTTCC TCAGGCATGC CCCTATGGGC CCTATGTGGA 5160 
AGCAAGAAGA ATTGACTGAT TTACAGGACT TCTCTTTATG TCAATCTTAA GAGGATGGAT 5220 
GAATCTGGAC ATTTGTTCCA CCCGACCTCT GACTGATGGT TTGGAAAATA ACTTTAATTA 5280 
GGATCATATG ACCATTGAAA AAGGAAAAAT GTAGACTCTG ACTTCCGTCC CACTGAAGGA 5340 
TTAATGAAAA CCTTTACTAG CATTTAGAGC TTTTCAGAAC ATCCCCACTG TCATGTGTCT 5400 
CAGCAGTGGA GACTGCAAGT AAGGCTTTTA ATTTTAGGAG GTTTTTTTTT TTTTTTTTTT 5460 
TTCCCCTAAA TGGTATGGCC AAAAGTCAGA GTTAAAATAT ATATAGTTAG ATTCCAACTT 5520 
CCTCCTTCAC TCTAAAAATA GAATCCAAAC CCACTCTTCA TATATGCTTC CAGAATGGGG 5580 
CTTAAGTACC AATCTCTGCT TTGCAATGGG CACAATCTTG GTCATGTCCT GAGGCTCTCT 5640 
AAGAAAAGAG AGGATCTAGG ATGGGAGAGC TAGAAAGTTG CTAACTGGGA AGAACAAGGC 5700 
CCTGAGGGGT TGGTCTACCA ATCTGGGAAG ATTTGAAAAC AAACTTCTCG CAACTGAAGG 5760 
AAGGCTGAAG GCTGCTGCAA GTCATTGAGT GACTTTAGGA TGAGCAAAAC ATTGGGCCAC 5820 
TTCCTAATGC CCTATGTGTA TAGTACCAGA AGCAAGGTCT CAGACTTAAC AGACCCAGCT 5880 
CTGTTCCAAG GTGAGTCTGA ACCAATAGAA AGCAAACATG TGCAGATATC CAAACAAGAC 5940 
TGCTCATGCA AGTCGGGGCT GGCTACCCGT CTTAGGCAGC AACAGCAGAG CTCCAGGGAG 6000 
CTTATTCAAT ATTTACTGAG ACTTCGAAGA CCCAGCAGAT GTTTAATGAA GTCACTATTT 6060
TGGCTCAAAC CCTCCACTTC TCCCCCTCCC CTCAAAAAGC CAACAGGTAA ACACATAAAT 6120 
GAAAGAAACC CACAGAAGGG GATGGGAAAT AAAGAAAATT CTCTCAAGAC TTCTCCAGGC 6180 
CCATGTCACT GGTCAGCGTG GTTTTTATGT GTATTAGGAT TGGGGGATGT GAAGAAATAA 6240 
GTATCCAGTA CTTTATAACC AAAGCAATTA AATGATATTG GGGTAGGGAA TGTTGGCCAG 6300 
TTTTGTTTAG TTTTGCCATC ACATTGTCAC CCAGACCTCA CCTAGCCCCA AGTAATCGGG 6360
CGCCCCGAAG AGGGAGACAG AGATGTGCCA GAGTTGACCC AGTGTGCGGA TGATAACTAC 6420
TGACGAAAGA GTCATCGACC TCAGTTAGTG GTTGGATGTA GTCACATTAG TTTGCCTCTC 6480 
CCCATCTTTG TCTCCCTGGC AAGGAGAATA TGCGGGACAT GATGCTAAGA GCCCTGGGTA 6540 
AATGTGGTGA GAATGCACGC GTGCATATGC TACACATATG TGCTTCTCAG TTGCAGAAAA 6600 
TGAACTGCTT TGGGAGATTA TCAGTAGAAA GAGTGTTATC ATATTGGTGC TGAGTGCTAT 6660
GTGTGCTTAT ACAATTTGTT CTTGTATTTT AATAAACTTT GAATAAAAGA ATAAAAAAAA 6720
AAAAAAAAAA AAAAA 



SEQ ID NO:132 PFH6 Protein sequence:
Protein Accession #: NP_054644.1

1          11         21         31         41         51
|          |          |          |          |          |

MGILSVDLLI TLQILPVFFS NCLFLALYDS VILLKHVVLL LSRSKSTRGE WRRMLTSEGL 60
RCVWKSFLLD AYKQVKLGED APNSSVVHVS STEGGDNSGN GTQEKIAEGA TCHLLDFASP 120 
ERPLVVNFGS ATXPPFTSQL PAFRKLVEEF SSVADFLLVY IDEAHPSDGW AIPGDSSLSF 180 
EVKKHQNQED RCAAAQQLLE RFSLPPQCRV VADRMDNNAN IAYGVAFERV CIVQRQKIAY 240
LGGKGPFSYN LQEVRHWLEK NFSKRXKKTR LAG



SEQ ID NO:133 PFH5 DNA SEQUENCE 

Nucleic Acid Accession #: NM_001141

Coding sequence: 72-2102 (underlined sequences correspond to start and stop codons)

1          11         21         31         41         51
|          |          |          |          |          |

CAGGCGTGTC CCAGGGGGAG CCCCGCTCTG CAGCCCTGTG CGCCGTAGAG AGCTGGACTT 60 
AGGCTGGCAG CATGGCCGAG TTCAGGGTCA GGGTGTCCAC CGGAGAAGCC TTCGGGGCTG 120
GCACATGGGA CAAAGTGTCT GTCAGCATCG TGGGGACCCG GGGAGAGAGC CCCCCACTGC 180
CCCTGGACAA TCTCGGCAAG GAGTTCACTG CGGGCGCTGA GGAGGACTTC CAGGTGACGC 240 
TCCCGGAGGA CGTAGGCCGA GTGCTGCTGC TGCGCGTGCA CAAGGCGCCC CCAGTGCTGC 300 
CCCTGCTGGG GCCCCTGGCC CCGGATGCCT GGTTCTGCCG CTGGTTCCAG CTGACACCGC 360
CGCGGGGCGG CCACCTCCTC TTCCCCTGCT ACCAGTGGCT GGAGGGGGCG GGGACCCTGG 420 
TGCTGCAGGA GGGTACAGCC AAGGTGTCCT GGGCAGACCA CCACCCTGTG CTCCAGCAAC 480 
AGCGCCAGGA GGAGCTTCAG GCCCGGCAGG AGATGTACCA GTGGAAGGCT TACAACCCAG 540
GTTGGCCTCA CTGCCTGGAT GAAAAGACAG TGGAAGACTT GGAGCTCAAT ATCAAATACT 600
CCACAGCCAA GAATGCCAAC TTTTATCTAC AAGCTGGCTC TGCTTTTGCA GAGATGAAAA 660
TCAAGGGGTT GCTGGACCGC AAGGGGCTCT GGAGGAGTCT GAATGAGATG AAAAGGATCT 720
TCAACTTCCG GAGGACCCCA GCAGCTGAGC ACGCATTTGA GCACTGGCAG GAGGATGCCT 780
TCTTCGCCTC CCAGTTCCTG AATGGTCTCA ACCCTGTCCT GATCCGCCGC TGTCACTACC 840
TCCCAAAGAA CTTCCCCGTC ACTGATGCCA TGGTGGCCTC ATTGTTGGGT CCTGGGACCA 900
GCTTGCAGGC TGAGCTAGAG AAGGGCTCCC TGTTCTTGGT GGATCACGGC ATCCTCTCTG 960
GCATCCAGAC CAATGTCATT AATGGGAAGC CGCAGTTCTC TGCGGCCCCA ATGACCCTGC 1020 
TATACCAGAG CCCAGGCTGC GGGCCGCTGC TGCCTCTCGC CATCCAGCTC AGCCAGACCC 1080 
CCGGCCCAAA CAGCCCCATC TTCCTGCCCA CTGATGACAA GTGGGACTGG TTGCTGGCCA 1140 
AGACCTGGGT GCGCAATGCC GAGTTCTCCT TCCATGAGGC CCTCACGCAC CTGCTGCACT 1200 
CACATCTGCT GCCTGAGGTC TTCACCCTGG CTACCCTGCG TCAGCTGCCC CACTGCCACC 1260 
CTCTCTTCAA GCTGCTGATC CCGCACACCC GATACACCCT GCACATCAAC ACACTCGCCC 1320
GGGAGCTGCT TATCGTGCCA GGGCAGGTGG TGGACAGGTC CACAGGCATC GGCATTGAAG 1380
GCTTCTCTGA GTTGATACAG AGGAACATGA AGCAGCTGAA CTATTCTCTC CTGTGTCTGC 1440
CTGAGGATAT CCGGACCCGA GGAGTTGAAG ACATCCCAGG CTACTACTAC CGTGATGATG 1500
GGATGCAGAT TTGGGGTGCA GTGGAACGCT TTGTCTCTGA AATCATCGGT ATCTACTACC 1560
CAAGTGATGA GTCTGTCCAA GATGACAGAG AGCTCCAGGC CTGGGTCAGA GAGATCTTCT 1620 
CCAAGGGCTT CCTAAACCAG GAGAGCTCAG GTATCCCTTC CTCACTGGAG ACCCGGGAAG 1680 
CCCTGGTGCA GTATGTCACC ATGGTGATAT TCACCTGCTC AGCCAAGCAT GCGGCTGTCA 1740 
GTGCAGGGCA GTTTGACTCC TGTGCTTGGA TGCCCAACCT GCCACCCAGC ATGCAGCTGC 1800 
CACCACCCAC CTCCAAAGGC CTGGCAACAT GCGAGGGCTT CATAGCCACC CTCCCACCTG 1860
TCAATGCCAC ATGTGATGTC ATCCTTGCTC TCTGGTTGCT GAGCAAGGAG CCTGGAGACC 1920
AAAGGCCCCT GGGCACCTAT CCGGATGAGC ACTTCACAGA GGAGGCCCCT CGGCGGAGCA 1980
TCGCCACCTT CCAGAGCCGC CTGGCCCAGA TCTCGAGGGG CATCCAGGAG CGGAACCGGG 2040
GCCTGGTGCT GCCCTACACC TACCTAGACC CTCCCCTCAT CGAGAACAGC GTCTCCATCT 2100
AAATCCCAGG GGAACACAGG CCCAGATGAC ATCCCTTTGA CCACATCGCT CTAGGATAAC 2160
TGGCACCCAG AGAAAAGGAC TCCTCAGAAA AAACAGGCCC CCATGTGCCT CTCCTGGGAC 2220 
AACCAGACTC TGTAACTCAC CCCCACCACC ATACACACAC ACAAAAACAG AAACAAAATC 2280 
AAAACAGAGA AAGCAGAAAA TCTACCAAGA ACAGAGTCTC AGGACAGAAC CACTGAGTCT 2340 
TTTGGAGGCT CCAAGCCTCA AAGTGCCCGC AGAGCCCACC TTGAGGGTTT TGCTAGTTGG 2400 
TTTTGTTTTG CGTTTACAGC CGTGGGGGGA AGCACATAAT CCCGCCCCAG GGCCCACTAG 2460
CATCCACTGA TTGGACCTTA TGGTCACCCA ACTCAAGGAC AGCCACCAAG AAGTGGCTGC 2520 
CAAAGAGACT GGGCGCAGTG GCTCATGCCC ATAATCCCAG CACTTTGGGA GATGGAGGCG 2580 
GGAAAATCAT TTGAGGTCAG AAGTTCAAGG CCAGCCTGGA CGACATAGCG AGACTCCACC 2640 
TCTACCAAAA AATAAAAATT AAAAAACAAA AAAAAAAAAA AAAAA 



SEQ ID NO:134 PFH5 Protein sequence: 
Protein Accession #: NP_001132.1

1          11         21         31         41         51
|          |          |          |          |          |

MAEFRVRVST GEAFGAGTWD KVSVSIVGTR GESPPLPLDN LGKEFTAGAE EDFQVTLPED 60 
VGRVLLLRVH KAPPVLPLLG PLAPDAWFCR WFQLTPPRGG HLLFPCYQWL EGAGTLVLQE 120 
GTAKVSWADH HPVLQQQRQE ELQARQEMYQ WKAYNPGWPH CLDEKTVEDL ELNIKYSTAK 180
NANFYLQAGS AFAEMKIKGL LDRKGLWRSL NEMKRIFNFR RTPAAEHAFE HWQEDAFFAS 240
QFLNGLNPVL IRRCHYLPKN FPVTDAMVAS LLGPGTSLQA ELEKGSLFLV DHGILSGIQT 300 
NVINGKPQFS AAPMTLLYQS PGCGPLLPLA IQLSQTPGPN SPIFLPTDDK WDWLLAKTWV 360 
RNAEFSFHEA LTHLLHSHLL PEVFTLATLR QLPHCHPLFK LLIPHTRYTL HINTLARELL 420 
IVPGQVVDRS TGIGIEGFSE LIQRNMKQLN YSLLCLPEDI RTRGVEDIPG YYYRDDGMQI 480
WGAVERFVSE IIGIYYPSDE SVQDDRELQA WVREIFSKGF LNQESSGIPS SLETREALVQ 540
YVTMVIFTCS AKHAAVSAGQ FDSCAWMPNL PPSMQLPPPT SKGLATCEGF IATLPPVNAT 600
CDVILALWLL SKEPGDQRPL GTYPDEHFTE EAPRRSIATF QSRLAQISRG IQERNRGLVL 660
PYTYLDPPLI ENSVSI 
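
The "Coding sequence: start-end" entries in this listing give 1-based, inclusive positions within the accompanying DNA sequence, and each sequence row carries six blocks of ten residues followed by a running position count. The following is a minimal sketch, in Python, of how such rows could be joined, the coding region cut out, and the result translated with the standard genetic code. The function names and the two toy rows are hypothetical illustrations and are not taken from the listing; the sketch also assumes the rows are free of transcription errors.

```python
import re

# Standard genetic code, built from the conventional T, C, A, G codon ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def join_listing_lines(lines):
    """Concatenate listing rows, dropping spaces and trailing position counts."""
    return "".join(re.sub(r"[^A-Za-z]", "", line) for line in lines).upper()


def extract_cds(seq, start, end):
    """Return the coding region for 1-based, inclusive listing coordinates."""
    return seq[start - 1:end]


def translate(cds):
    """Translate codon by codon; stop codons become '*', unknown codons 'X'."""
    return "".join(
        CODON_TABLE.get(cds[i:i + 3], "X") for i in range(0, len(cds) - 2, 3)
    )


# Hypothetical toy rows (ten residues per row here, for brevity).
rows = ["CCATGGCCGA 10", "GTTCTAAGGG 20"]
sequence = join_listing_lines(rows)
coding = extract_cds(sequence, 3, 17)   # toy "Coding sequence: 3-17"
print(translate(coding))                # MAEF*
```

In this toy case the start codon ATG occupies positions 3-5 and the stop codon TAA positions 15-17, mirroring how the underlined start and stop codons bound the stated coding range in each DNA entry.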



SEQ ID NO:135 PFH4 DNA SEQUENCE

Nucleic Acid Accession #: NM_002742 

Coding sequence: 236-2974 (underlined sequences correspond to start and stop codons)

1          11         21         31         41         51
|          |          |          |          |          |

GAATTCCTTC TCTCCTCCTC CTCGCCCTTC TCCTCGCCCT CCTCCTCCTC CTCGCCCTCC 60 
CCTCCCGATC CTCATCCCCT TGCCCTCCCC CAGCCCAGGG ACTTTTCCGG AAAGTTTTTA 120 
TTTTCCGTCT GGGCTCTCGG AGAAAGAAGC TCCTGGCTCA GCGGCTGCAA AACTTTCCTG 180
CTGCCGCGCC GCCAGCCCCC GCCCTCCGCT GCCCGGCCCT GCGCCCCGCC GAGCGATGAG 240
CGCCCCTCCG GTCCTGCGGC CGCCCAGTCC GCTGCTGCCC GTGGCGGCGG CAGCTGCCGC 300 
AGCGGCCGCC GCACTGGTCC CAGGGTCCGG GCCCGGGCCC GCGCCGTTCT TGGCTCCTGT 360 
CGCGGCCCCG GTCGGGGGCA TCTCGTTCCA TCTGCAGATC GGCCTGAGCC GTGAGCCGGT 420 
GCTGCTGCTG CAGGACTCGT CCGGGGACTA CAGCCTGGCG CACGTCCGCG AGATGGCTTG 480 
CTCCATTGTC GACCAGAAGT TCCCTGAATG TGGTTTCTAC GGAATGTATG ATAAGATCCT 540
GCTTTTTCGC CATGACCCTA CCTCTGAAAA CATCCTTCAG CTGGTGAAAG CGGCCAGTGA 600
TATCCAGGAA GGCGATCTTA TTGAAGTGGT CTTGTCACGT TCCGCCACCT TTGAAGACTT 660
TCAGATTCGT CCCCACGCTC TCTTTGTTCA TTCATACAGA GCTCCAGCTT TCTGTGATCA 720 
CTGTGGAGAA ATGCTGTGGG GGCTGGTACG TCAAGGTCTT AAATGTGAAG GGTGTGGTCT 780 
GAATTACCAT AAGAGATGTG CATTTAAAAT ACCCAACAAT TGCAGCGGTG TGAGGCGGAG 840 
AAGGCTCTCA AACGTTTCCC TCACTGGGGT CAGCACCATC CGCACATCAT CTGCTGAACT 900
CTCTACAAGT GCCCCTGATG AGCCCCTTCT GCAAAAATCA CCATCAGAGT CGTTTATTGG 960
TCGAGAGAAG AGGTCAAATT CTCAATCATA CATTGGACGA CCAATTCACC TTGACAAGAT 1020
TTTGATGTCT AAAGTTAAAG TGCCGCACAC ATTTGTCATC CACTCCTACA CCCGGCCCAC 1080
AGTGTGCCAG TACTGCAAGA AGCTTCTGAA GGGGCTTTTC AGGCAGGGCT TGCAGTGCAA 1140
AGATTGCAGA TTCAACTGCC ATAAACGTTG TGCACCGAAA GTACCAAACA ACTGCCTTGG 1200 
CGAAGTGACC ATTAATGGAG ATTTGCTTAG CCCTGGGGCA GAGTCTGATG TGGTCATGGA 1260 
AGAAGGGAGT GATGACAATG ATAGTGAAAG GAACAGTGGG CTCATGGATG ATATGGAAGA 1320 
AGCAATGGTC CAAGATGCAG AGATGGCAAT GGCAGAGTGC CAGAACGACA GTGGCGAGAT 1380 
GCAAGATCCA GACCCAGACC ACGAGGACGC CAACAGAACC ATCAGTCCAT CAACAAGCAA 1440
CAATATCCCA CTCATGAGGG TAGTGCAGTC TGTCAAACAC ACGAAGAGGA AAAGCAGCAC 1500
AGTCATGAAA GAAGGATGGA TGGTCCACTA CACCAGCAAG GACACGCTGC GGAAACGGCA 1560
CTATTGGAGA TTGGATAGCA AATGTATTAC CCTCTTTCAG AATGACACAG GAAGCAGGTA 1620
CTACAAGGAA ATTCCTTTAT CTGAAATTTT GTCTCTGGAA CCAGTAAAAA CTTCAGCTTT 1680
AATTCCTAAT GGGGCCAATC CTCATTGTTT CGAAATCACT ACGGCAAATG TAGTGTATTA 1740 
TGTGGGAGAA AATGTGGTCA ATCCTTCCAG CCCATCACCA AATAACAGTG TTCTCACCAG 1800 
TGGCGTTGGT GCAGATGTGG CCAGGATGTG GGAGATAGCC ATCCAGCATG CCCTTATGCC 1860 
CGTCATTCCC AAGGGCTCCT CCGTGGGTAC AGGAACCAAC TTGCACAGAG ATATCTCTGT 1920 
GAGTATTTCA GTATCAAATT GCCAGATTCA AGAAAATGTG GACATCAGCA CAGTATATCA 1980 
GATTTTTCCT GATGAAGTAC TGGGTTCTGG ACAGTTTGGA ATTGTTTATG GAGGAAAACA 2040
TCGTAAAACA GGAAGAGATG TAGCTATTAA AATCATTGAC AAATTACGAT TTCCAACAAA 2100 
ACAAGAAAGC CAGCTTCGTA ATGAGGTTGC AATTCTACAG AACCTTCATC ACCCTGGTGT 2160 
TGTAAATTTG GAGTGTATGT TTGAGACGCC TGAAAGAGTG TTTGTTGTTA TGGAAAAACT 2220 
CCATGGAGAC ATGCTGGAAA TGATCTTGTC AAGTGAAAAG GGCAGGTTGC CAGAGCACAT 2280 
AACGAAGTTT TTAATTACTC AGATACTCGT GGCTTTGCGG CACCTTCATT TTAAAAATAT 2340
CGTTCACTGT GACCTCAAAC CAGAAAATGT GTTGCTAGCC TCAGCTGATC CTTTTCCTCA 2400
GGTGAAACTT TGTGATTTTG GTTTTGCCCG GATCATTGGA GAGAAGTCTT TCCGGAGGTC 2460
AGTGGTGGGT ACCCCCGCTT ACCTGGCTCC TGAGGTCCTA AGGAACAAGG GCTACAATCG 2520
CTCTCTAGAC ATGTGGTCTG TTGGGGTCAT CATCTATGTA AGCCTAAGCG GCACATTCCC 2580
ATTTAATGAA GATGAAGACA TACACGACCA AATTCAGAAT GCAGCTTTCA TGTATCCACC 2640
AAATCCCTGG AAGGAAATAT CTCATGAAGC CATTGATCTT ATCAACAATT TGCTGCAAGT 2700
AAAAATGAGA AAGCGCTACA GTGTGGATAA GACCTTGAGC CACCCTTGGC TACAGGACTA 2760
TCAGACCTGG TTAGATTTGC GAGAGCTGGA ATGCAAAATC GGGGAGCGCT ACATCACCCA 2820 
TGAAAGTGAT GACCTGAGGT GGGAGAAGTA TGCAGGCGAG CAGCGGCTGC AGTACCCCAC 2880 
ACACCTGATC AATCCAAGTG CTAGCCACAG TGACACTCCT GAGACTGAAG AAACAGAAAT 2940 
GAAAGCCCTC GGTGAGCGTG TCAGCATCCT CTGAGTTCCA TCTCCTATAA TCTGTCAAAA 3000
CACTGTGGAA CTAATAAATA CATACGGTCA GGTTTAACAT TTGCCTTGCA GAACTGCCAT 3060
TATTTTCTGT CAGATGAGAA CAAAGCTGTT AAACTGTTAG CACTGTTGAT GTATCTGAGT 3120
TGCCAAGACA AATCAACAGA AGCATTTGTA TTTTGTGTGA CCAACTGTGT TGTATTAACA 3180
AAAGTTCCCT GAAACACGAA ACTTGTTATT GTGAATGATT CATGTTATAT TTAATGCATT 3240 
AAACCTGTCT CCACTGTGCC TTTGCAAATC AGTGTTTTTC TTACTGGAGC TTCATTTTGG 3300 
TAAGAGACAG AATGTATCTG TGAAGTAGTT CTGTTTGGTG TGTCCCATTG GTGTTGTCAT 3360 
TGTAAACAAA CTCTTGAAGA GTCGATTATT TCCAGTGTTC TATGAACAAC TCCAAAACCC 3420 
ATGTGGGAAA AAAATGAATG AGGAGGGTAG GGAATAAAAT CCTAAGACAC AAATGCATGA 3480
ACAAGTTTTA ATGTATAGTT TTGAATCCTT TGCCTGCCTG GTGTGCCTCA GTATATTTAA 3540
ACTCAAGACA ATGCACCTAG CTGTGCAAGA CCTAGTGCTC TTAAGCCTAA ATGCCTTAGA 3600
AATGTAAACT GCCATATATA ACAGATACAT TTCCCTCTTT CTTATAATAC TCTGTTGTAC 3660
TATGGAAAAT CAGCTGCTCA GCAACCTTTC ACCTTTGTGT ATTTTTCAAT AATAAAAAAT 3720
ATTCTTGTCA AAAAAAAAAA AA 



SEQ ID NO:136 PFH4 Protein sequence: 
Protein Accession #: NP_002733.1

1          11         21         31         41         51
|          |          |          |          |          |

MSAPPVLRPP SPLLPVAAAA AAAAAALVPG SGPGPAPFLA PVAAPVGGIS FHLQIGLSRE 60 
PVLLLQDSSG DYSLAHVREM ACSIVDQKFP ECGFYGMYDK ILLFRHDPTS ENILQLVKAA 120
SDIQEGDLIE VVLSRSATFE DFQIRPHALF VHSYRAPAFC DHCGEMLWGL VRQGLKCEGC 180
GLNYHKRCAF KIPNNCSGVR RRRLSNVSLT GVSTIRTSSA ELSTSAPDEP LLQKSPSESF 240
IGREKRSNSQ SYIGRPIHLD KILMSKVKVP HTFVIHSYTR PTVCQYCKKL LKGLFRQGLQ 300 
CKDCRFNCHK RCAPKVPNNC LGEVTINGDL LSPGAESDVV MEEGSDDNDS ERNSGLMDDM 360 
EEAMVQDAEM AMAECQNDSG EMQDPDPDHE DANRTISPST SNNIPLMRVV QSVKHTKRKS 420
STVMKEGWMV HYTSKDTLRK RHYWRLDSKC ITLFQNDTGS RYYKEIPLSE ILSLEPVKTS 480
ALIPNGANPH CFEITTANVV YYVGENVVNP SSPSPNNSVL TSGVGADVAR MWEIAIQHAL 540
MPVIPKGSSV GTGTNLHRDI SVSISVSNCQ IQENVDISTV YQIFPDEVLG SGQFGIVYGG 600
KHRKTGRDVA IKIIDKLRFP TKQESQLRNE VAILQNLHHP GVVNLECMFE TPERVFVVME 660
KLHGDMLEMI LSSEKGRLPE HITKFLITQI LVALRHLHFK NIVHCDLKPE NVLLASADPF 720
PQVKLCDFGF ARIIGEKSFR RSVVGTPAYL APEVLRNKGY NRSLDMWSVG VIIYVSLSGT 780
FPFNEDEDIH DQIQNAAFMY PPNPWKEISH EAIDLINNLL QVKMRKRYSV DKTLSHPWLQ 840
DYQTWLDLRE LECKIGERYI THESDDLRWE KYAGEQRLQY PTHLINPSAS HSDTPETEET 900
EMKALGERVS IL



SEQ ID NO:137 PFH3 DNA SEQUENCE

Nucleic Acid Accession #: X95425 

Coding sequence: 712-3825 (underlined sequences correspond to start and stop codons)

1          11         21         31         41         51
|          |          |          |          |          |

AATGGTCAGT CAATACATTA TAACATAATA CACCAAATGC TAGAATAGAA GGGGAGGGGG 60 
GCACACATAA TGACTCACTG CTGGAAGAAG GGTGCATCAG TGAATTAAAA AATGTCCCTC 120 
CCCTCTTCAG CACTCAGCGC GCAGCTATTT CCTTCTGCCA GTCTCTTTGA ACTCTGGATC 180
TTTGCTTTTG CTCGCTGCTC TCCTGTTTTT CATTCTCCAC ATTTTCTCAA TCCTCTTTCT 240
TTATCCTTAG CCACCCTGCT TTTTTCCTCC TTTTTTAAAA AATCGGAGAT TTCGTCTTAA 300 
AATGATTTGT CTTCCTTACC TTCGTCCATT TCAACACTGA AGGCTGCAAA GAACTTCACC 360 
TTTCCCCTAG TGGTATTTAA AAATTCTCAA TCCGTAAAAA GTCTTTTTGA AAGGCAAAGG 420 
AACAGGACCC AGACCCTCTC GACACCCTTG ATCCGAGTCA GATCTGCACT AGCAACCAGA 480
ACTAATATTT CATTTAACCC ACCAAAAGGG GGAGGCGAGA GGAGCCAGAA GCAAACTTCA 540
TCTGTCTCAG ACGGATCCGT GGTTCCTACA TTTGGAGGAG CCGCGTGTCA GAAGGCGTAG 600
GACCCCAAGG GGGGACAAGG AGGACTCCCG AGTCTCCCTT CTCCGCTCTC CGAGACCGAA 660
GAGGTGGACT GAGCCGCTCG GGACAGCGGC ACCGGAGGAG GCTCGGAGAA GATGCGGGGC 720
TCGGGGCCCC GGGGTGCGGG ACACCGGCGG CCCCCAAGCG GCGGCGGCGA CACCCCCATC 780 
ACCCCAGCGT CCCTGGCCGG CTGCTACTCT GCACCTCGAC GGGCTCCCCT CTGGACGTGC 840 
CTTCTCCTGT GCGCCGCACT CCGGACCCTC CTGGCCAGCC CCAGCAACGA AGTGAATTTA 900
TTGGATTCAC GCACTGTCAT GGGGGACCTG GGATGGATTG CTTTTCCAAA AAATGGGTGG 960
GAAGAGATTG GTGAAGTGGA TGAAAATTAT GCCCCTATCC ACACATACCA AGTATGCAAA 1020
GTGATGGAAC AGAATCAGAA TAACTGGCTT TTGACCAGTT GGATCTCCAA TGAAGGTGCT 1080
TCCAGAATCT TCATAGAACT CAAATTTACC CTGCGGGACT GCAACAGCCT TCCTGGAGGA 1140
CTGGGGACCT GTAAGGAAAC CTTTAATATG TATTACTTTG AGTCAGATGA TCAGAATGGG 1200
AGAAACATCA AGGAAAACCA ATACATCAAA ATTGATACCA TTGCTGCCGA TGAAAGCTTT 1260 
ACAGAACTTG ATCTTGGTGA CCGTGTTATG AAACTGAATA CAGAGGTCAG AGATGTAGGA 1320 
CCTCTAAGCA AAAAGGGATT TTATCTTGCT TTTCAAGATG TTGGTGCTTG CATTGCTCTG 1380 
GTTTCTGTGC GTGTATACTA TAAAAAATGC CCTTCTGTGG TACGACACTT GGCTGTCTTC 1440
CCTGACACCA TCACTGGAGC TGATTCTTCC CAATTGCTCG AAGTGTCAGG CTCCTGTGTC 1500 
AACCATTCTG TGACCGATGA ACCTCCCAAA ATGCACTGCA GCGCCGAAGG GGAGTGGCTG 1560 
GTGCCCATCG GGAAATGCAT GTGCAAGGCA GGATATGAAG AGAAAAATGG CACCTGTCAA 1620 
GTGTGCAGAC CTGGGTTCTT CAAAGCCTCA CCTCACATCC AGAGCTGCGG CAAATGTCCA 1680 
CCTCACAGTT ATACCCATGA GGAAGCTTCA ACCTCTTGTG TCTGTGAAAA GGATTATTTC 1740
AGGAGAGAGT CTGATCCACC CACAATGGCA TGCACAAGAC CCCCCTCTGC TCCTCGGAAT 1800
GCCATCTCAA ATGTTAATGA AACTAGTGTC TTTCTGGAAT GGATTCCGCC TGCTGACACT 1860
GGTGGAAGGA AAGACGTGTC ATATTATATT GCATGCAAGA AGTGCAACTC CCATGCAGGT 1920
GTGTGTGAGG AGTGTGGCGG TCATGTCAGG TACCTTCCCC GGCAAAGCGG CCTGAAAAAC 1980
ACCTCTGTCA TGATGGTGGA TCTACTCGCT CACACAAACT ATACCTTTGA GATTGAGGCA 2040
GTGAATGGAG TGTCCGACTT GAGCCCAGGA GCCCGGCAGT ATGTGTCTGT AAATGTAACC 2100
ACAAATCAAG CAGCTCCATC TCCAGTCACC AATGTGAAAA AAGGGAAAAT TGCAAAAAAC 2160 
AGCATCTCTT TGTCTTGGCA AGAACCAGAT CGTCCCAATG GAATCATCCT AGAGTATGAA 2220
ATCAAGCATT TTGAAAAGGA CCAAGAGACC AGCTACACGA TTATCAAATC TAAAGAGACA 2280
ACTATTACTG CAGAGGGCTT GAAACCAGCT TCAGTTTATG TCTTCCAAAT TCGAGCACGT 2340
ACAGCAGCAG GCTATGGTGT CTTCAGTCGA AGATTTGAGT TTGAAACCAC CCCAGTGTTT 2400
GCAGCATCCA GCGATCAAAG CCAGATTCCT GTAATTGCTG TGTCTGTGAC AGTAGGAGTC 2460
ATTTTGTTGG CAGTGGTTAT CGGCGTCCTC CTCAGTGGAA GTTGCTGCGA ATGTGGCTGT 2520
GGGAGGGCTT CTTCCCTGTG CGCTGTTGCC CATCCAATCC TAATATGGCG GTGTGGCTAC 2580
AGCAAAGCAA AACAAGATCC AGAAGAGGAA AAGATGCATT TTCATAATGG GCACATTAAA 2640
CTGCCAGGAG TAAGAACTTA CATTGATCCA CATACCTATG AGGATCCCAA TCAAGCTGTC 2700
CACGAATTTG CCAAGGAGAT AGAAGCATCA TGTATCACCA TTGAGAGAGT TATTGGAGCA 2760 
GGTGAATTTG GTGAAGTTTG TAGTGGACGT TTGAAACTAC CAGGAAAAAG AGAATTACCT 2820
GTGGCTATCA AAACCCTTAA AGTAGGCTAT ACTGAAAAGC AACGCAGAGA TTTCCTAGGT 2880
GAAGCAAGTA TCATGGGACA GTTTGATCAT CCTAACATCA TCCATTTAGA AGGTGTGGTG 2940
ACCAAAAGTA AACCAGTGAT GATCGTGACA GAGTATATGG AGAATGGCTC TTTAGATACA 3000
TTTTTGAAGA AAAACGATGG GCAGTTCACT GTGATTCAGC TTGTTGGCAT GCTGAGAGGT 3060
ATCTCTGCAG GAATGAAGTA CCTTTCTGAC ATGGGCTATG TGCATAGAGA TCTTGCTGCC 3120 
AGAAACATCT TAATCAACAG TAACCTTGTG TGCAAAGTGT CTGACTTTGG ACTTTCCCGG 3180 
GTACTGGAAG ATGATCCCGA GGCAGCCTAC ACCACAAGGG GAGGAAAAAT TCCAATCAGA 3240 
TGGACTGCCC CAGAAGCAAT AGCTTTCCGA AAGTTTACTT CTGCCAGTGA TGTCTGGAGT 3300 
TATGGAATAG TAATGTGGGA AGTTGTGTCT TATGGAGAGA GACCCTACTG GGAGATGACC 3360 
AATCAAGATG TGATTAAAGC GGTAGAGGAA GGCTATCGTC TGCCAAGCCC CATGGATTGT 3420 
CCTGCTGCTC TCTATCAGTT AATGCTGGAT TGCTGGCAGA AAGAGCGAAA TAGCAGGCCC 3480
AAGTTTGATG AAATAGTCAA CATGTTGGAC AAGCTGATAC GTAACCCAAG TAGTCTGAAG 3540
ACGCTGGTTA ATGCATCCTG CAGAGTATCT AATTTATTGG CAGAACATAG CCCACTAGGA 3600
TCTGGGGCCT ACAGATCAGT AGGTGAATGG CTAGAGGCAA TCAAGATGGG CCGGTATACA 3660
GAGATTTTCA TGGAAAATGG ATACAGTTCA ATGGACGCTG TGGCTCAGGT GACCTTGGAG 3720
GATTTGAGAC GGCTTGGAGT GACTCTTGTC GGTCACCAGA AGAAGATCAT GAACAGCCTT 3780 
CAAGAAATGA AGGTGCAGCT GGTAAACGGA ATGGTGCCAT TGTAACTTCA TGTAAATGTC 3840 
GCTTCTTCAA GTGAATGATT CTGCACTTTG TAAACAGCAC TGAGATTTAT TTTAACAAAA 3900 
AAA 



SEQ ID NO:138 PFH3 Protein sequence: 
Protein Accession #: CAA64700.1

1          11         21         31         41         51
|          |          |          |          |          |

MRGSGPRGAG HRRPPSGGGD TPITPASLAG CYSAPRRAPL WTCLLLCAAL RTLLASPSNE 60 
VNLLDSRTVM GDLGWIAFPK NGWEEIGEVD ENYAPIHTYQ VCKVMEQNQN NWLLTSWISN 120 
EGASRIFIEL KFTLRDCNSL PGGLGTCKET FNMYYFESDD QNGRNIKENQ YIKIDTIAAD 180
ESFTELDLGD RVMKLNTEVR DVGPLSKKGF YLAFQDVGAC IALVSVRVYY KKCPSVVRHL 240
AVFPDTITGA DSSQLLEVSG SCVNHSVTDE PPKMHCSAEG EWLVPIGKCM CKAGYEEKNG 300
TCQVCRPGFF KASPHIQSCG KCPPHSYTHE EASTSCVCEK DYFRRESDPP TMACTRPPSA 360
PRNAISNVNE TSVFLEWIPP ADTGGRKDVS YYIACKKCNS HAGVCEECGG HVRYLPRQSG 420
LKNTSVMMVD LLAHTNYTFE IEAVNGVSDL SPGARQYVSV NVTTNQAAPS PVTNVKKGKI 480
AKNSISLSWQ EPDRPNGIIL EYEIKHFEKD QETSYTIIKS KETTITAEGL KPASVYVFQI 540
RARTAAGYGV FSRRFEFETT PVFAASSDQS QIPVIAVSVT VGVILLAVVI GVLLSGSCCE 600
CGCGRASSLC AVAHPILIWR CGYSKAKQDP EEEKMHFHNG HIKLPGVRTY IDPHTYEDPN 660
QAVHEFAKEI EASCITIERV IGAGEFGEVC SGRLKLPGKR ELPVAIKTLK VGYTEKQRRD 720
FLGEASIMGQ FDHPNIIHLE GVVTKSKPVM IVTEYMENGS LDTFLKKNDG QFTVIQLVGM 780
LRGISAGMKY LSDMGYVHRD LAARNILINS NLVCKVSDFG LSRVLEDDPE AAYTTRGGKI 840
PIRWTAPEAI AFRKFTSASD VWSYGIVMWE VVSYGERPYW EMTNQDVIKA VEEGYRLPSP 900
MDCPAALYQL MLDCWQKERN SRPKFDEIVN MLDKLIRNPS SLKTLVNASC RVSNLLAEHS 960
PLGSGAYRSV GEWLEAIKMG RYTEIFMENG YSSMDAVAQV TLEDLRRLGV TLVGHQKKIM 1020
NSLQEMKVQL VNGMVPL 



SEQ ID NO:139 PFH2 DNA SEQUENCE

Nucleic Acid Accession #: NM_016029

Coding sequence: 78-1097 (underlined sequences correspond to start and stop codons)

1          11         21         31         41         51
|          |          |          |          |          |

CTGCGATCCC GCAGGGCAGC GACGCGACTC TGGTGCGGGC CGTCTTCTTC CCCCCGAGCT 60
GGGCGTGCGC GGCCGCAATG AACTGGGAGC TGCTGCTGTG GCTGCTGGTG CTGTGCGCGC 120
TGCTCCTGCT CTTGGTGCAG CTGCTGCGCT TCCTGAGGGC TGACGGCGAC CTGACGCTAC 180
TATGGGCCGA GTGGCAGGGA CGACGCCCAG AATGGGAGCT GACTGATATG GTGGTGTGGG 240
TGACTGGAGC CTCGAGTGGA ATTGGTGAGG AGCTGGCTTA CCAGTTGTCT AAACTAGGAG 300 
TTTCTCTTGT GCTGTCAGCC AGAAGAGTGC ATGAGCTGGA AAGGGTGAAA AGAAGATGCC 360 
TAGAGAATGG CAATTTAAAA GAAAAAGATA TACTTGTTTT GCCCCTTGAC CTGACCGACA 420 
CTGGTTCCCA TGAAGCGGCT ACCAAAGCTG TTCTCCAGGA GTTTGGTAGA ATCGACATTC 480 
TGGTCAACAA TGGTGGAATG TCCCAGCGTT CTCTGTGCAT GGATACCAGC TTGGATGTCT 540 
ACAGAAAGCT AATAGAGCTT AACTACTTAG GGACGGTGTC CTTGACAAAA TGTGTTCTGC 600 
CTCACATGAT CGAGAGGAAG CAAGGAAAGA TTGTTACTGT GAATAGCATC CTGGGTATCA 660
TATCTGTACC TCTTTCCATT GGATACTGTG CTAGCAAGCA TGCTCTCCGG GGTTTTTTTA 720
ATGGCCTTCG AACAGAACTT GCCACATACC CAGGTATAAT AGTTTCTAAC ATTTGCCCAG 780
GACCTGTGCA ATCAAATATT GTGGAGAATT CCCTAGCTGG AGAAGTCACA AAGACTATAG 840 
GCAATAATGG AGACCAGTCC CACAAGATGA CAACCAGTCG TTGTGTGCGG CTGATGTTAA 900 
TCAGCATGGC CAATGATTTG AAAGAAGTTT GGATCTCAGA ACAACCTTTC TTGTTAGTAA 960 
CATATTTGTG GCAATACATG CCAACCTGGG CCTGGTGGAT AACCAACAAG ATGGGGAAGA 1020 
AAAGGATTGA GAACTTTAAG AGTGGTGTGG ATGCAGACTC TTCTTATTTT AAAATCTTTA 1080 
AGACAAAACA TGACTGAAAA GAGCACCTGT ACTTTTCAAG CCACTGGAGG GAGAAATGGA 1140
AAACATGAAA ACAGCAATCT TCTTATGCTT CTGAATAATC AAAGACTAAT TTGTGATTTT 1200
ACTTTTTAAT AGATATGACT TTGCTTCCAA CATGGAATGA AATAAAAAAT AAATAATAAA 1260
AGATTGCCAT GAATCTTGCA AA



SEQ ID NO:140 PFH2 Protein sequence:
Protein Accession #: NP_057113.1

1          11         21         31         41         51
|          |          |          |          |          |

MNWELLLWLL VLCALLLLLV QLLRFLRADG DLTLLWAEWQ GRRPEWELTD MVVWVTGASS 60 
GIGEELAYQL SKLGVSLVLS ARRVHELERV KRRCLENGNL KEKDILVLPL DLTDTGSHEA 120
ATKAVLQEFG RIDILVNNGG MSQRSLCMDT SLDVYRKLIE LNYLGTVSLT KCVLPHMIER 180
KQGKIVTVNS ILGIISVPLS IGYCASKHAL RGFFNGLRTE LATYPGIIVS NICPGPVQSN 240
IVENSLAGEV TKTIGNNGDQ SHKMTTSRCV RLMLISMAND LKEVWISEQP FLLVTYLWQY 300
MPTWAWWITN KMGKKRIENF KSGVDADSSY FKIFKTKHD



SEQ ID NO:141 PFH1 DNA SEQUENCE

Nucleic Acid Accession #: NM_021614

Coding sequence: 1-1740 (underlined sequences correspond to start and stop codons)

1          11         21         31         41         51
|          |          |          |          |          |

ATGAGCAGCT GCAGGTACAA CGGGGGCGTC ATGCGGCCGC TCAGCAACTT GAGCGCGTCC 60
CGCCGGAACC TGCACGAGAT GGACTCAGAG GCGCAGCCCC TGCAGCCCCC CGCGTCTGTC 120 
GGAGGAGGTG GCGGCGCGTC CTCCCCGTCT GCAGCCGCTG CCGCCGCCGC CGCTGTTTCG 180 
TCCTCAGCCC CCGAGATCGT GGTGTCTAAG CCCGAGCACA ACAACTCCAA CAACCTGGCG 240
CTCTATGGAA CCGGCGGCGG AGGCAGCACT GGAGGAGGCG GCGGCGGTGG CGGGAGCGGG 300 
CACGGCAGCA GCAGTGGCAC CAAGTCCAGC AAAAAGAAAA ACCAGAACAT CGGCTACAAG 360 
CTGGGCCACC GGCGCGCCCT GTTCGAAAAG CGCAAGCGGC TCAGCGACTA CGCGCTCATC 420 
TTCGGCATGT TCGGCATCGT GGTCATGGTC ATCGAGACCG AGCTGTCGTG GGGCGCCTAC 480
GACAAGGCGT CGCTGTATTC CTTAGCTCTG AAATGCCTTA TCAGTCTCTC CACGATCATC 540
CTGCTCGGTC TGATCATCGT GTACCACGCC AGGGAAATAC AGTTGTTCAT GGTGGACAAT 600
GGAGCAGATG ACTGGAGAAT AGCCATGACT TATGAGCGTA TTTTCTTCAT CTGCTTGGAA 660
ATACTGGTGT GTGCTATTCA TCCCATACCT GGGAATTATA CATTCACATG GACGGCCCGG 720 
CTTGCCTTCT CCTATGCCCC ATCCACAACC ACCGCTGATG TGGATATTAT TTTATCTATA 780 
CCAATGTTCT TAAGACTCTA TCTGATTGCC AGAGTCATGC TTTTACATAG CAAACTTTTC 840 
ACTGATGCCT CCTCTAGAAG CATTGGAGCA CTTAATAAGA TAAACTTCAA TACACGTTTT 900 
GTTATGAAGA CTTTAATGAC TATATGCCCA GGAACTGTAC TCTTGGTTTT TAGTATCTCA 960 
TTATGGATAA TTGCCGCATG GACTGTCCGA GCTTGTGAAA GGTACCATGA TCAACAGGAT 1020 
GTTACTAGCA ACTTCCTTGG AGCGATGTGG TTGATATCAA TAACTTTTCT CTCCATTGGT 1080 
TATGGTGACA TGGTACCTAA CACATACTGT GGAAAAGGAG TCTGCTTACT TACTGGAATT 1140
ATGGGTGCTG GTTGCACAGC CCTGGTGGTA GCTGTAGTGG CAAGGAAGCT AGAACTTACC 1200
AAAGCAGAAA AACACGTGCA CAATTTCATG ATGGATACTC AGCTGACTAA AAGAGTAAAA 1260
AATGCAGCTG CCAATGTACT CAGGGAAACA TGGCTAATTT ACAAAAATAC AAAGCTAGTG 1320
AAAAAGATAG ATCATGCAAA AGTAAGAAAA CATCAACGAA AATTCCTGCA AGCTATTCAT 1380
CAATTAAGAA GTGTAAAAAT GGAGCAGAGG AAACTGAATG ACCAAGCAAA CACTTTGGTG 1440 
GACTTGGCAA AGACCCAGAA CATCATGTAT GATATGATTT CTGACTTAAA CGAAAGGAGT 1500 
GAAGACTTCG AGAAGAGGAT TGTTACCCTG GAAACAAAAC TAGAGACTTT GATTGGTAGC 1560 
ATCCACGCCC TCCCTGGGCT CATAAGCCAG ACCATCAGGC AGCAGCAGAG AGATTTCATT 1620 
GAGGCTCAGA TGGAGAGCTA CGACAAGCAC GTCACTTACA ATGCTGAGCG GTCCCGGTCC 1680 
TCGTCCAGGA GGCGGCGGTC CTCTTCCACA GCACCACCAA CTTCATCAGA GAGTAGCTAG



SEQ ID NO:142 PFH1 Protein sequence: 
Protein Accession #: NP_067627

1          11         21         31         41         51
|          |          |          |          |          |

MSSCRYNGGV MRPLSNLSAS RRNLHEMDSE AQPLQPPASV GGGGGASSPS AAAAAAAAVS 60
SSAPEIVVSK PEHNNSNNLA LYGTGGGGST GGGGGGGGSG HGSSSGTKSS KKKNQNIGYK 120 
LGHRRALFEK RKRLSDYALI FGMFGIVVMV IETELSWGAY DKASLYSLAL KCLISLSTII 180
LLGLIIVYHA REIQLFMVDN GADDWRIAMT YERIFFICLE ILVCAIHPIP GNYTFTWTAR 240
LAFSYAPSTT TADVDIILSI PMFLRLYLIA RVMLLHSKLF TDASSRSIGA LNKINFNTRF 300
VMKTLMTICP GTVLLVFSIS LWIIAAWTVR ACERYHDQQD VTSNFLGAMW LISITFLSIG 360
YGDMVPNTYC GKGVCLLTGI MGAGCTALVV AVVARKLELT KAEKHVHNFM MDTQLTKRVK 420
NAAANVLRET WLIYKNTKLV KKIDHAKVRK HQRKFLQAIH QLRSVKMEQR KLNDQANTLV 480
DLAKTQNIMY DMISDLNERS EDFEKRIVTL ETKLETLIGS IHALPGLISQ TIRQQQRDFI 540
EAQMESYDKH VTYNAERSRS SSRRRRSSST APPTSSESS 



SEQ ID NO:143 PFG9 DNA SEQUENCE 
Nucleic Acid Accession #: All 101 39, coding region is FGENESH predicted 
Coding sequence: 1-1896 (underlined sequences correspond to start and stop codons)

1          11         21         31         41         51
|          |          |          |          |          |

ATGCGCGCCG TGCCGCTGCC CGCCCCGCTC CTGCCGCTGC TGCTGCTCGC GCTCCTGGCC 60
GCTCCCGCCG CCCGCGCCAG CAGAGCCGAG TCCGTCTCCG CGCCGTGGCC CGAACCCGAG 120
CGCGAGTCGC GGCCACCGCC CGGCCCGGGG CCCGGGAACA CCACCCGGTT TGGGTCTGGG 180
GCGGCGGGCG GCAGCGGCAG CTCCAGCTCC AACAGCAGTG GCGACGCCTT GGTGACCCGC 240
ATTTCCATCC TCCTCCGCGA CCTACCCACC CTCAAGGCAG CCGTGATCGT GGCGTTCGCC 300
TTTACCACCC TCCTCATCGC CTGCCTGCTG CTGCGCGTCT TCAGGTCGGG AAAGAGGTTA 360
AAGAAGACAC GCAAGTATGA TATCATCACC ACTCCAGCAG AGCGAGTGGA AATGGCGCCA 420
CTAAATGAAG AGGATGATGA AGATGAGGAC TCCACAGTAT TCGACATCAA ATACAGAGTG 480
TCCTTGCCGG CTGCACTGAG ACGTCAGCTG CCAGGGTGCC AGACGCTACT GACAGTTCCT 540
GTGCCCCCAC CCTTCATCCT CGACATTGAC CTTCCAGCAA GATGCAGTGG AAGGCCTGAT 600
GGTGGAATCA GACCTGGTAA AACCTGTTTC CCAGCCTGGT GGCATCCTGT GGAAAGTTGG 660
TCAGCTGCAA CCTGGGGTGT GAAGGACTGG ACCTGGAAGC CCTCTTGCGT CGGAGGTGTT 720
GAAACCAAAA CGAACGTTAT GTATAAAACC CCAGCTCCAT CGTGCGTGTC AGGCATCTGC 780
TCAGACTGTC ACTGGCAAGC TCGTTTCCAC GTCACCACAA TGGAGTTGCT TCTGCCACCC 840
TTTGGGCATC CCTTTAAAGT GCCCCCTACT TCTACTCCCC ATGGTTTTCG ACAACTGCAG 900
CTGAATCTCA TGGAAAAGCT GGATTCCTCT GCCTTACGCA GAAACACCCG GGCTCCATCT 960
GCCAGGTGCT TGCCACTGGT CCTGGCAGAA ATGGCGGCTG CTGAAAGTGA CCTTCCAAAT 1020
CCTTGGTGGC ACTTCAGCGC CACAGGCTCT CCAATAAAAA CCCTTTACAC ACAAACCATG 1080
AGTACCTTGG GCTTGGATGT TTTCTGTGGT GCCGGCCAGC GGGGCACCTT TTGTGAAGAC 1140
AGAGCAGTGA CTAAGGTTCT CCAGGGTAGC TCTTTCTCCA AACAGCTGCG CTGGAAGCCA 1200
GCCCTAGAGA GTGGGTTTCC CCATCATCTC AGGCTTCTCA GAGAGTGTCC TCCGCTGAGC 1260
ACCCATCCTG TCAGGTTGGC TCGTTCAGAT GCCCGGGGAC AAGCCAGCCT GACGGGGAGG 1320
AGGGTGTTTC GGCGTCCGCG GCAGTCTCTG CATGGCGGAG GGTCAGCGGG TACCGCAACT 1380
TGCCTTTTGG TTTTGAAGAT TCTGTTGAGG CGCCATCCTC ACCTTGACCT CTTCTACAAA 1440
ATCTGTCTCC CCTGCTGTGC CGTGGAACAC CTACGGGAAG CCAAGAGAAG CTCAGTGACT 1500
GTCCTTGCGT CATTTGAGCA GAGCCCACAA AAGGCAGCTG CTGCCCACGG GGAGCCTGTC 1560
AAACGAGGGC CCAGTGGGCA ATTGACCAGA CACACATGCC CTGGCTGGGG GATCACACAT 1620
GCGAACCTGC AGACAATTCC AGATACCCAA GGCCAGGAAG GCCCACGTGA GGATGTCACT 1680
CACCCTGGAG GAGACTTGGA TGGGGTGGCA AATTTCTATT TGGAGGAAGA GGGTTTCCAG 1740
GATGGCAGAT GCCAGAAGAT GGTCCTGATG TCTGAGGAAG GGCCACCTAG TTTGACAGGA 1800
TGTGAGAGGC TCACAGGTTC CCATCACTTC TCCAGCCATT CCAAGTCTTG GTCCTTCCTT 1860
TCCCCCCGAC AGCCCCTGTT TCTGTCCAGG CCCTGA

SEQ ID NO:144 PFG9 Protein sequence: 

Protein Accession #: none available, FGENESH predicted 

1          11         21         31         41         51
|          |          |          |          |          |

MRAVPLPAPL LPLLLLALLA APAARASRAE SVSAPWPEPE RESRPPPGPG PGNTTRFGSG 60
AAGGSGSSSS NSSGDALVTR ISILLRDLPT LKAAVIVAFA FTTLLIACLL LRVFRSGKRL 120
KKTRKYDIIT TPAERVEMAP LNEEDDEDED STVFDIKYRV SLPAALRRQL PGCQTLLTVP 180
VPPPFILDID LPARCSGRPD GGIRPGKTCF PAWWHPVESW SAATWGVKDW TWKPSCVGGV 240
ETKTNVMYKT PAPSCVSGIC SDCHWQARFH VTTMELLLPP FGHPFKVPPT STPHGFRQLQ 300
LNLMEKLDSS ALRRNTRAPS ARCLPLVLAE MAAAESDLPN PWWHFSATGS PIKTLYTQTM 360
STLGLDVFCG AGQRGTFCED RAVTKVLQGS SFSKQLRWKP ALESGFPHHL RLLRECPPLS 420
THPVRLARSD ARGQASLTGR RVFRRPRQSL HGGGSAGTAT CLLVLKILLR RHPHLDLFYK 480
ICLPCCAVEH LREAKRSSVT VLASFEQSPQ KAAAAHGEPV KRGPSGQLTR HTCPGWGITH 540
ANLQTIPDTQ GQEGPREDVT HPGGDLDGVA NFYLEEEGFQ DGRCQKMVLM SEEGPPSLTG 600
CERLTGSHHF SSHSKSWSFL SPRQPLFLSR P



SEQ ID NO:145 PFG6 DNA SEQUENCE

Nucleic Acid Accession #: NM_013427

Coding sequence: 875-3799 (underlined sequences correspond to start and stop codons)

1          11         21         31         41         51
|          |          |          |          |          |

OU GGCTGGGCTG CGAATAGCGT GTTCCTCTCC GGCGGAACAC ACACACCCGG CCTTGGGGCT 60 
GTCTCCTGAA GCTCCCTCCT CCACGGAGAG CGCTGAGCGC CGCCGGGAAT TCCATCCCAC 120 
CGTGGGCACG CAGTCTTTGG AGGTCCCGGG CGCAGCACGC TCGGTGTCCC CACACTGCAG 180 
C AAG AC AG AG ACCCCGCGGG AACCTTGAGC TTGGAACAAC CCTTGAGCCT CTGCAGTCGG 240 
AAG AGTGGGC GCAGCAGCCC AGCGG AGGCC AGGCGCGCAA CCTCGGGCGC CGGGGCAAGG 300 

65 AGAGAGTGCA GGGAGGCGCA GCTCAGGCGC CCGGCTCAGG AGCGGGAGGA AGTTCTCGCG 360 
GCGCCGGGAG CGCGGTGGAC GCGCCCTGGG CGCACGCCCA GGCAGCCTTC TCCCTGGCCC 420 
TCGGG ACTGT CCTCGGGCGG CAAGGAGGAG CTTGCTGGAG TCTTAGAGGC CATCCAGAGC 480 
CAGCGAGCAG G A GCGCTGCG TCTCCCGCCT CAGCTAGGAA GGGGGAGTGG CGCTGGCAGG 540 
CTGGAGCTGG GAACCCAGCG AGCGCCTGAC CTTCCTCCTC CTCTTCCTGA CCCTCTTCGC 600 

70 GTCTTGGGCT CCGGAGG AAG GTTCTAGCGG CTGCAGGAGG TCCCCAGACC CATTTTCCTA 660 
GAAGGCTGGT GATGGATCTG CTGCTCCTGC CGCCGCCGGG GCACTTGGAG CGCACCGGCG 720 
GCGCGTGAGC TGGGCTTTGC TCTCCACCGC CCTGGGCAAA CCCCGGGCCA GCCCCGCCTG 780
GCACCTTTGC CTGAGTCCCT TTCGGTTCCC GACCCAAAGC CACCAGCGTC CAGGGAGGGA 840
GGAGGAGGTG GTCCTCAGGT GCAGCCCCGC CGAGATGTCC GCGCAGAGCC TGCTCCACAG 900

CGTCTTCTCC TGTTCCTCGC CCGCTTCAAG TAGCGCGGCC TCGGCCAAGG GCTTCTCCAA 960

GAGGAAGCTG CGCCAGACCC GCAGCCTGGA CCCGGCCCTG ATCGGCGGCT GCGGGAGCGA 1020
CGAGGCGGGC GCGGAGGGCA GTGCGCGGGG AGCCACGGCG GGCCGCCTCT ACTCCCCATC 1080 
ACTCCCAGCC GAGAGTCTCG GCCCTCGCTT GGCGTCCTCT TCCCGGGGTC CGCCCCCCAG 1140
GGCCACCAGG CTACCGCCTC CTGGACCTCT TTGCTCGTCC TTCTCCACAC CCAGCACCCC 1200
GCAGGAGAAG TCACCATCCG GCAGCTTTCA CTTTGACTAT GAGGTTCCCC TGGGTCGCGG 1260
CGGCCTCAAG AAGAGCATGG CCTGGGACCT GCCTTCTGTC CTGGCCGGGC CAGCCAGTAG 1320 
CCGAAGCGCT TCCAGCATCC TCTGTTCATC CGGGGGAGGC CCCAATGGCA TCTTCGCTTC 1380 
TCCTAGGAGG TGGCTCCAGC AGAGGAAGTT CCAGTCCCCA CCCGACAGTC GCGGGCACCC 1440
CTACGTCGTG TGGAAATCCG AGGGTGATTT CACCTGGAAC AGCATGTCAG GCCGCAGTGT 1500
GCGGCTGAGG TCAGTCCCCA TCCAGAGTCT CTCAGAGCTG GAGAGGGCCC GGCTGCAGGA 1560 
AGTGCCTTTT TATCAGTTGC AACAGGACTG TGACCTGAGC TGTCAGATCA CCATTCCCAA 1620 
AGATGGACAA AAGAGAAAGA AATCTTTAAG AAAGAAACTG GATTCACTAG GAAAGGAGAA 1680 
AAACAAAGAC AAAGAATTCA TCCCACAGGC ATTTGGAATG CCCTTATCCC AAGTCATTGC 1740

GAATGACAGG GCCTATAAAC TCAAGCAGGA CTTGCAGAGG GACGAGCAGA AAGATGCATC 1800
TGACTTTGTG GCTTCCCTCC TCCCATTTGG AAATAAAAGA CAAAACAAAG AACTCTCAAG 1860 
CAGTAACTCA TCTCTCAGCT CAACCTCAGA AACACCGAAT GAGTCAACGT CCCCAAACAC 1920 
CCCGGAACCG GCTCCTCGGG CTAGGAGGAG GGGTGCCATG TCAGTGGATT CTATCACCGA 1980
TCTTGATGAC AATCAGTCTC GACTACTAGA AGCTTTACAA CTTTCCTTGC CTGCTGAGGC 2040

TCAAAGTAAA AAGGAAAAAG CCAGAGATAA GAAACTCAGT CTGAATCCTA TTTACAGACA 2100
GGTCCCTAGG CTGGTGGACA GCTGCTGTCA GCACCTAGAA AAACATGGCC TCCAGACAGT 2160 
GGGGATATTC CGAGTTGGAA GCTCAAAAAA GAGAGTGAGA CAATTACGTG AGGAATTTGA 2220
CCGTGGGATT GATGTCTCTC TGGAGGAGGA GCACAGTGTT CATGATGTGG CAGCCTTGCT 2280
GAAAGAGTTC CTGAGGGACA TGCCAGACCC CCTTCTCACC AGGGAGCTGT ACACAGCTTT 2340

CATCAACACT CTCTTGTTGG AGCCGGAGGA ACAGCTGGGC ACCTTGCAGC TCCTCATATA 2400
CCTTCTACCT CCCTGCAACT GCGACACCCT CCACCGCCTG CTACAGTTCC TCTCCATCGT 2460
GGCCAGGCAT GCCGATGACA ACATCAGCAA AGATGGGCAA GAGGTCACTG GGAATAAAAT 2520
GACATCTCTA AACTTAGCCA CCATATTTGG ACCCAACCTG CTGCACAAGC AGAAGTCATC 2580
AGACAAAGAA TTCTCAGTTC AGAGTTCAGC CCGGGCTGAG GAGAGCACGG CCATCATCGC 2640

TGTTGTGCAA AAGATGATTG AAAATTATGA AGCCCTGTTC ATGGTTCCCC CAGATCTCCA 2700
GAACGAAGTG CTGATCAGCC TGTTAGAGAC CGATCCTGAT GTCGTGGACT ATTTACTCAG 2760
AAGAAAGGCT TCCCAATCAT CAAGCCCTGA CATGCTGCAG TCGGAAGTTT CCTTTTCCGT 2820
GGGAGGGAGG CATTCATCTA CAGACTCCAA CAAGGCCTCC AGCGGAGACA TCTCCCCTTA 2880 
TGACAACAAC TCCCCAGTGC TGTCTGAGCG CTCCCTGCTG GCTATGCAAG AGGACGCGGC 2940

CCCGGGGGGC TCGGAGAAGC TTTACAGAGT GCCAGGGCAG TTTATGCTGG TGGGCCACTT 3000
GTCGTCGTCA AAGTCAAGGG AAAGTTCTCC TGGACCAAGG CTTGGGAAAG ATCTGTCAGA 3060
GGAGCCTTTC GATATCTGGG GAACTTGGCA TTCAACATTA AAAAGCGGAT CCAAAGACCC 3120 
AGGAATGACA GGTTCCTCTG GAGACATTTT TGAAAGCAGC TCCCTAAGAG CGGGGCCCTG 3180
CTCCCTTTCT CAAGGGAACC TGTCCCCAAA TTGGCCTCGG TGGCAGGGGA GCCCCGCAGA 3240

GCTGGACAGC GACACGCAGG GGGCTCGGAG GACTCAGGCC GCAGCCCCCG CGACGGAGGG 3300
CAGGGCCCAC CCTGCGGTGT CGCGCGCCTG CAGCACGCCC CACGTCCAGG TGGCAGGGAA 3360 
AGCCGAGCGG CCCACGGCCA GGTCGGAGCA GTACTTGACC CTGAGCGGCG CCCACGACCT 3420 
CAGCGAGAGT GAGCTGGATG TGGCCGGGCT GCAGAGCCGG GCCACACCTC AGTGCCAAAG 3480
ACCCCATGGG AGTGGGAGGG ATGACAAGCG GCCCCCGCCT CCATACCCGG GCCCAGGGAA 3540

GCCCGCGGCA GCGGCAGCCT GGATCCAGGG GCCCCCGGAA GGCGTGGAGA CACCCACGGA 3600
CCAGGGAGGC CAAGCAGCCG AGCGAGAGCA GCAGGTCACG CAGAAAAAAC TGAGCAGCGC 3660 
CAACTCCCTG CCAGCGGGCG AGCAGGACAG TCCGCGCCTG GGGGACGCTG GCTGGCTCGA 3720 
CTGGCAGAGA GAGCGCTGGC AGATCTGGGA GCTCCTGTCG ACCGACAACC CCGATGCCCT 3780
GCCCGAGACG CTGGTCTGAG CCCGCACCCA GCCGAGCCCC CCCTGCCCCG AGCCCCCCGC 3840

CCTCCAGCCC AGGGGGGACC GTGGGTGGTG GCCACTGGCA CACTTAGTGT TCTTCTTTCA 3900
CACTTCTCAA AAGTGACACA AGAGAAATCC AGTTCACCTA CAGAGGTAGA GCACTCACGC 3960 
CCCCGCCATT GAGAATAAGG TTCCATTGCG TAGCCAGCCT TAGGAAAAAC AAACAGAACC 4020 
CAAACCAGAT GGCAATGTCC AATCTAAAAA CGTCCCTCTT GGCTCTATAA TATAAGATAC 4080
AACTCTTGCT TGGTATAGCC TAACCGTATT TATGTGTCTT CGGTTTTGAC TATTGTGTAT 4140 

TCTGTAACAG ATTATGTATA ATCATATATG ATATATTCAC AAAGAGAAAA CAAAAGGAAC 4200
TTTTAAAAAA AAAATCACTT CACTTATATT AAGCAATGAG ATATACTAAA CAATGAGATT 4260
CTATAGAATG TTCTAGAATG TGCACAAGCG GGTTTCTGTG CTTTTGCCAT AGCTTTATAA 4320 
CTGGGGATAA CCCTTCCTTC GATACCAAAC ACTAACAAGA GGAAGCAGAA TATGAGAAGC 4380 
CATATTTTTA CATAGGAGTC AGATACAAAA AGAAAAATCA CTGAATGCTT TTAGATATTG 4440

AATACGTTTT CAGGAAAATG CTAAATCTGA TAGATTACGA AATATATTTT TAGAACTTGT 4500

TTAGAAAGGA TTCAGTTAAC CAAACAAGAA AAAGGCAGTG CCTCACAAAG AAATTAAGAA 4560 
GTTGTCCGTC CCACGTTACA TCAAATTCAG TTTTATATAG GCCATATATA ATATATATTT 4620 
ATAATGTATA ATTTTTATGT ATTTTTCAAA ACTACAAACT GGAATCCAAC TATAAAGTGT 4680
TTAAGAATCT ACACAGAATA TTCAAATTAT AGAACATGTT TTTTCCCTTT GCCCCATAAT 4740

CAGTATTTGC CAAATTACAT GCAATTCCTT AAAAACTAAA TCACATTGGT AAAAGGCCTA 4800
CAGCTTTGTA CTTACATTGT GCCAAAGGCT GAGGAAATGT TTTCTTTCGA ATTTTTATGT 4860 
GTATTGTAAA ATGTTCTACC GTACTTTAGT AGTTTGAAGT TTTCAAGTGC ATAACTATTT 4920
TTGACCAGCA GAAGGCGATA CGCTTCAGTA TTTTATGCAA TTTTTTTTCA CTTCGAAGGG 4980 
AAAGTGTATT ATAAAAAAAG ATTTTTTTTT TTTAAAACAT GCTACTCTTA ATTTTCATGT 5040

TGGTGATGAA ATTCCCAGTG GTGTTTCTTA AGGTTCTATC TTGTGCCATG ATGAATAAAA 5100
AGTTAAGCAA AAAAAAAAAA AAAAAAAAAA AAA 



SEQ ID NO:146 PFG6 Protein sequence:
Protein Accession #: NP_038286.1

1 11 21 31 41 51
| | | | | |

MSAQSLLHSV FSCSSPASSS AASAKGFSKR KLRQTRSLDP ALIGGCGSDE AGAEGSARGA 60
TAGRLYSPSL PAESLGPRLA SSSRGPPPRA TRLPPPGPLC SSFSTPSTPQ EKSPSGSFHF 120 
DYEVPLGRGG LKKSMAWDLP SVLAGPASSR SASSILCSSG GGPNGIFASP RRWLQQRKFQ 180
SPPDSRGHPY VVWKSEGDFT WNSMSGRSVR LRSVPIQSLS ELERARLQEV PFYQLQQDCD 240 
LSCQITIPKD GQKRKKSLRK KLDSLGKEKN KDKEFIPQAF GMPLSQVIAN DRAYKLKQDL 300
QRDEQKDASD FVASLLPFGN KRQNKELSSS NSSLSSTSET PNESTSPNTP EPAPRARRRG 360 






AMSVDSITDL DDNQSRLLEA LQLSLPAEAQ SKKEKARDKK LSLNPIYRQV PRLVDSCCQH 420
LEKHGLQTVG IFRVGSSKKR VRQLREEFDR GIDVSLEEEH SVHDVAALLK EFLRDMPDPL 480
LTRELYTAFI NTLLLEPEEQ LGTLQLLIYL LPPCNCDTLH RLLQFLSIVA RHADDNISKD 540 
GQEVTGNKMT SLNLATIFGP NLLHKQKSSD KEFSVQSSAR AEESTAIIAV VQKMIENYEA 600
LFMVPPDLQN EVLISLLETD PDVVDYLLRR KASQSSSPDM LQSEVSFSVG GRHSSTDSNK 660
ASSGDISPYD NNSPVLSERS LLAMQEDAAP GGSEKLYRVP GQFMLVGHLS SSKSRESSPG 720 
PRLGKDLSEE PFDIWGTWHS TLKSGSKDPG MTGSSGDIFE SSSLRAGPCS LSQGNLSPNW 780 
PRWQGSPAEL DSDTQGARRT QAAAPATEGR AHPAVSRACS TPHVQVAGKA ERPTARSEQY 840
LTLSGAHDLS ESELDVAGLQ SRATPQCQRP HGSGRDDKRP PPPYPGPGKP AAAAAWIQGP 900 
PEGVETPTDQ GGQAAEREQQ VTQKKLSSAN SLPAGEQDSP RLGDAGWLDW QRERWQIWEL 960
LSTDNPDALP ETLV 



SEQ ID NO:147 PFG4 DNA SEQUENCE

Nucleic Acid Accession #: NM_002202

Coding sequence: 240-1289 (underlined sequences correspond to start and stop codons) 



1 11 21 31 41 51
| | | | | |



CCCCCGAGCC GCGCCGAGTC TGCCGCCGCC GCAGCGCCTC CGCTCCGCCA ACTCCGCCGG 60 
CTTAAATTGG ACTCCTAGAT CCGCGAGGGC GCGGCGCAGC CGAGCAGCGG CTCTTTCAGC 120 
ATTGGCAACC CCAGGGGCCA ATATTTCCCA CTTAGCCACA GCTCCAGCAT CCTCTCTGTG 180

GGCTGTTCAC CAACTGTACA ACCACCATTT CACTGTGGAC ATTACTCCCT CTTACAGATA 240
TGGGAGACAT GGGAGATCCA CCAAAAAAAA AACGTCTGAT TTCCCTATGT GTTGGTTGCG 300 
GCAATCAGAT TCACGATCAG TATATTCTGA GGGTTTCTCC GGATTTGGAA TGGCATGCGG 360 
CATGTTTGAA ATGTGCGGAG TGTAATCAGT ATTTGGACGA GAGCTGTACA TGCTTTGTTA 420 
GGGATGGGAA AACCTACTGT AAAAGAGATT ATATCAGGTT GTACGGGATC AAATGCGCCA 480
AGTGCAGCAT CGGCTTCAGC AAGAACGACT TCGTGATGCG TGCCCGCTCC AAGGTGTATC 540
ACATCGAGTG TTTCCGCTGT GTGGCCTGCA GCCGCCAGCT CATCCCTGGG GACGAATTTG 600
CGCTTCGGGA GGACGGTCTC TTCTGCCGAG CAGACCACGA TGTGGTGGAG AGGGCCAGTC 660
TAGGCGCTGG CGACCCGCTC AGTCCCCTGC ATCCAGCGCG GCCACTGCAA ATGGCAGCGG 720 
AGCCCATCTC CGCCAGGCAG CCAGCCCTGC GGCCCCACGT CCACAAGCAG CCGGAGAAGA 780
CCACCCGCGT GCGGACTGTG CTGAACGAGA AGCAGCTGCA CACCTTGCGG ACCTGCTACG 840
CCGCAAACCC GCGGCCAGAT GCGCTCATGA AGGAGCAACT GGTAGAGATG ACGGGCCTCA 900 
GTCCCCGTGT GATCCGGGTC TGGTTTCAAA ACAAGCGGTG CAAGGACAAG AAGCGAAGCA 960 
TCATGATGAA GCAACTCCAG CAGCAGCAGC CCAATGACAA AACTAATATC CAGGGGATGA 1020 
CAGGAACTCC CATGGTGGCT GCCAGTCCAG AGAGACACGA CGGTGGCTTA CAGGCTAACC 1080
CAGTGGAAGT ACAAAGTTAC CAGCCACCTT GGAAAGTACT GAGCGACTTC GCCTTGCAGA 1140
GTGACATAGA TCAGCCTGCT TTTCAGCAAC TGGTCAATTT TTCAGAAGGA GGACCGGGCT 1200 
CTAATTCCAC TGGCAGTGAA GTAGCATCAA TGTCCTCTCA ACTTCCAGAT ACACCTAACA 1260 
GCATGGTAGC CAGTCCTATT GAGGCATGAG GAACATTCAT TCTGTATTTT TTTTCCCTGT 1320
TGGAGAAAGT GGGAAATTAT AATGTCGAAC TCTGAAACAA AAGTATTTAA CGACCCAGTC 1380 
AATGAAAACT GAATCAAGAA ATGAATGCTC CATGAAATGC ACGAAGTCTG TTTTAATGAC 1440
AAGGTGATAT GGTAGCAACA CTGTGAAGAC AATCATGGGA TTTTACTAGA ATTAAACAAC 1500
AAACAAAACG CAAAACCCAG TATATGCTAT TCAATGATCT TAGAAGTACT GAAAAAAAAA 1560 
GACGTTTTTA AAACGTAGAG GATTTATATT CAAGGATCTC AAAGAAAGCA TTTTCATTTC 1620 
ACTGCACATC TAGAGAAAAA CAAAAATAGA AAATTTTCTA GTCCATCCTA ATCTGAATGG 1680 
TGCTGTTTCT ATATTGGTCA TTGCCTTGCC AAACAGGAGC TCCAGCAAAA GCGCAGGAAG 1740
AGAGACTGGC CTCCTTGGCT GAAAGAGTCC TTTCAGGAAG GTGGAGCTGC ATTGGTTTGA 1800 
TATGTTTAAA GTTGACTTTA ACAAGGGGTT AATTGAAATC CTGGGTCTCT TGGCCTGTCC 1860 
TGTAGCTGGT TTATTTTTTA CTTTGCCCCC TCCCCACTTT TTTTGAGATC CATCCTTTAT 1920 
CAAGAAGTCT GAAGCGACTA TAAAGGTTTT TGAATTCAGA TTTAAAAACC AACTTATAAA 1980 
GCATTGCAAC AAGGTTACCT CTATTTTGCC ACAAGCGTCT CGGGATTGTG TTTGACTTGT 2040
GTCTGTCCAA GAACTTTTCC CCCAAAGATG TGTATAGTTA TTGGTTAAAA TGACTGTTTT 2100 
CTCTCTCTAT GGAAATAAAA AGGAAAAAAA AAAGGAAACT TTTTTTGTTT GCTCTTGCAT 2160 
TGCAAAAATT ATAAAGTAAT TTATTATTTA TTGTCGGAAG ACTTGOCACT TTTCATGTCA 2220 
TTTGACATTT TTTGTTTGCT GAAGTGAAAA AAAAAGATAA AGGTTGTACG GTGGTCTTTG 2280 
AATTATATGT CTAATTCTAT GTGTTTTGTC TTTTTCTTAA ATATTATGTG AAATCAAAGC 2340
GCCATATGTA GAATTATATC TTCAGGACTA TTTCACTAAT AAACATTTGG CATAGAT



SEQ ID NO:148 PFG4 Protein sequence:
Protein Accession #: NP_002193.1

1 11 21 31 41 51
| | | | | |

MGDPPKKKRL ISLCVGCGNQ IHDQYILRVS PDLEWHAACL KCAECNQYLD ESCTCFVRDG 60
KTYCKRDYIR LYGIKCAKCS IGFSKNDFVM RARSKVYHIE CFRCVACSRQ LTPGDEFALK 120
EDGLFCRADH DVVERASLGA GDPLSPLHPA RPLQMAAEPI SARQPALRPH VHKQPEKTTR 180
VRTVLNEKQL HTLRTCYAAN PRPDALMKEQ LVEMTGLSPR VIRVWFQNKR CKDKKRSIMM 240 
KQLQQQQPND KTNIQGMTGT PMVAASPERH DGGLQANPVE VQSYQPPWKV LSDFALQSDI 300

DQPAFQQLVN FSEGGPGSNS TGSEVASMSS QLPDTPNSMV ASPIEA
SEQ ID NO:149 PFG2 DNA SEQUENCE

Nucleic Acid Accession #: NM_001172

Coding sequence: 39-1103 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51 
| | | | | |

GCGGAGCTCT GCCTTGGAGA TTCTCAGTGC TGCGGATCAT GTCCCTAAGG GGCAGCCTCT 60
CGCGTCTCCT CCAGACGCGA GTGCATTCCA TCCTGAAGAA ATCCGTCCAC TCCGTGGCTG 120
TGATAGGAGC CCCGTTCTCA CAAGGGCAGA AAAGAAAAGG AGTGGAGCAT GGTCCCGCTG 180
CCATAAGAGA AGCTGGCTTG ATGAAAAGGC TCTCCAGTTT GGGCTGCCAC CTAAAAGACT 240
TTGGAGATTT GAGTTTTACT CCAGTCCCCA AAGATGATCT CTACAACAAC CTGATAGTGA 300 
ATCCACGCTC AGTGGGTCTT GCCAACCAGG AACTGGCTGA GGTGGTTAGC AGAGCTGTGT 360
CAGATGGCTA CAGCTGTGTC ACACTGGGAG GAGACCACAG CCTGGCAATC GGTACCATTA 420 
GTGGCCATGC CCGACACTGC CCAGACCTTT GTGTTGTCTG GGTTGATGCC CATGCTGACA 480
TCAACACACC CCTTACCACT TCATCAGGAA ATCTCCATGG ACAGCCAGTT TCATTTCTCC 540 
TCAGAGAACT ACAGGATAAG GTACCACAAC TCCCAGGATT TTCCTGGATC AAACCTTGTA 600
TCTCTTCTGC AAGTATTGTG TATATTGGTC TGAGAGACGT GGACCCTCCT GAACATTTTA 660
TTTTAAAGAA CTATGATATC CAGTATTTTT CCATGAGAGA TATTGATCGA CTTGGTATCC 720 
AGAAGGTCAT GGAACGAACA TTTGATCTGC TGATTGGCAA GAGACAAAGA CCAATCCATT 780
TGAGTTTTGA TATTGATGCA TTTGACCCTA CACTGGCTCC AGCCACAGGA ACTCCTGTTG 840 
TCGGGGGACT AACCTATCGA GAAGGCATGT ATATTGCTGA GGAAATACAC AATACAGGGT 900 
TGCTATCAGC ACTGGATCTT GTTGAAGTCA ATCCTCAGTT GGCCACCTCA GAGGAAGAGG 960
CGAAGACTAC AGCTAACCTG GCAGTAGATG TGATTGCTTC AAGCTTTGGT CAGACAAGAG 1020 
AAGGAGGGCA TATTGTCTAT GACCAACTTC CTACTCCCAG TTCACCAGAT GAATCAGAAA 1080 
ATCAAGCACG TGTGAGAATT TAGGAGACAC TGTGCACTGA CATGTTTCAC AACAGGCATT 1140
CCAGAATTAT GAGGCATTGA GGGGATAGAT GAATACTAAA TGGTTGTCTG GGTCAATACT 1200 
GCCTTAATGA GAACATTTAC ACATTCTCAC AATTGTAAAG TTTCCCCTCT ATTTTGGTGA 1260 
CCAATACTAC TGTAAATGTA TTTGGTTTTT TGCAGTTCAC AGGGTATTAA TATGCTACAG 1320
TACTATGTAA ATTTAAAGAA GTCATAAACA GCATTTATTA CCTTGGTATA TCATACTGGT 1380 
CTTGTTGCTG TTGTTCCTTC ACATTTAAGT GGTTTTTCAT CTTTCCTCCC TCCTCCCACA 1440 
GCCTGGCTAT ACAGTGCATC CTTGAACTGT CAGCCCACAG CAGCAATATG CTTATTCTAT 1500 
CCACATCCCT AACATCATGC ATTCACAAGG TCAAAGTTCT GGTCCACAAA CCCTTCCCTA 1560 
TAGAAGTTCA ATGGCTGCGA AAGAATTTGT AGTAAACCAG GCCTCCCAGG ATGGCGAGCT 1620 
CCAGTAAGAT GATAATGGAA AGCAGCAGCT TGTTGGTTGT CACTCTACAA AGAGAAGCAA 1680 
AGTGGGGAGT AGTCAGAAGT TTGGATAACC TTCCTTCTAA ACATTTGGGG GTTAGACCTG 1740
GGACCACGGC TGGATACTCT GAGGCTGTAT GTTTGATCAC ACAGCCACTT AGCAGGAAGT 1800
ACTCATAAGG TTCTTTAGCT GTCACTTAGG GATAACACTG TCTACCTCAC AGAAATGTTA 1860 
AACTGAGACA ATAAAACCCA AAGCAT 



SEQ ID NO:150 PFG2 Protein sequence: 
Protein Accession #: NP_001163.1

1 11 21 31 41 51
| | | | | |

MSLRGSLSRL LQTRVHSILK KSVHSVAVIG APFSQGQKRK GVEHGPAAIR EAGLMKRLSS 60
LGCHLKDFGD LSFTPVPKDD LYNNLIVNPR SVGLANQELA EVVSRAVSDG YSCVTLGGDH 120
SLAIGTISGH ARHCPDLCVV WVDAHADIKT PLTTSSGNLH GQPVSFLLRE LQDKVPQLPG 180
FSWIKPCISS ASIVYIGLRD VDPPEHFILK NYDIQYFSMR DIDRLGIQKV MERTFDLLIG 240
KRQRPIHLSF DIDAFDPTLA PATGTPVVGG LTYREGMYIA EEIHNTGLLS ALDLVEVNPQ 300
LATSEEEAKT TANLAVDVIA SSFGQTREGG HIVYDQLPTP SSPDESENQA RVRI 



SEQ ID NO:151 PFG1 DNA SEQUENCE

Nucleic Acid Accession #: NM_017906

Coding sequence: 80-1255 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51
| | | | | |

AATTATATAT TTTTACTCTA TGTTTCTCTA CATGTTTTTT TCTTTCCGTT GCTGGCGGAA 60 
GAGGCACGTG CGCTGCTGAA TGGAGCTGGT CGCTGGTTGC TACGAGCAGG TCCTCTTTGG 120
GTTCGCTGTA CACCCGGAGC CCAAGGCTTG CGGCGACCAC GAGCAATGGA CTCTTGTGGC 180
TGACTTCACT CACCATGCTC ACACTGCCTC CTTGTCAGCA GTAGCTGTAA ATAGTCGTTT 240 
TGTGGTCACT GGGAGCAAAG ATGAAACAAT TCACATTTAT GACATGAAAA AGAAGATTGA 300
GCATGGGGCT CTAGTGCATC ACAGTGGTAC AATAACTTGC CTGAAATTCT ATGGCAACAG 360 
GCATTTAATC AGTGGAGCGG AAGATGGACT CATCTGTATC TGGGATGCAA AGAAATGGGA 420 
ATGCCTGAAG TCAATTAAAG CTCACAAAGG ACAGGTGACC TTCCTTTCTA TTCACCCATC 480
TGGCAAGTTG GCCCTGTCGG TTGGTACAGA TAAAACTTTA AGAACGTGGA ATCTTGTAGA 540 
AGGAAGATCA GCATTCATAA AAAATATAAA ACAAAATGCT CACATAGTAG AATGGTCCCC 600 
AAGAGGAGAG CAGTATGTAG TTATCATACA GAATAAAATA GACATCTATC AGCTTGACAC 660 
TGCATCCATT AGTGGCACCA TCACAAATGA AAAGAGAATT TCCTCTGTTA AATTTCTTTC 720 
AGAGTCTGTC CTTGCAGTGG CTGGAGATGA AGAAGTTATA AGGTTTTTTG ACTGTGATTC 780 
ACTAGTGTGC CTCTGCGAAT TTAAAGCTCA TGAAAACAGG GTAAAGGACA TGTTCAGTTT 840 
TGAAATTCCA GAGCATCATG TTATTGTTTC AGCATCGAGT GATGGTTTCA TCAAAATGTG 900 
GAAGCTTAAG CAGGATAAGA AAGTTCCCCC ATCTTTACTC TGTGAAATAA ACACTAATGC 960
CAGGCTGACG TGTCTTGGAG TGTGGCTAGA CAAAGTGGCA GACATGAAAA GCCTTCCTCC 1020 
AGCTGCAGAG CCTTCTCCTG TAAGTAAAGA ACAGTCCAAA ATTGGCAAAA AGGAGCCTGG 1080
TGACACAGTG CACAAAGAAG AAAAGCGGTC AAAACCTAAC ACAAAGAAAC GCGGTTTAAC 1140
AGGTGACAGT AAGAAAGCAA CAAAAGAAAG TGGCCTGATA TCAACCAAGA AGAGGAAAAT 1200
GGTAGAAATG TTGGAAAAGA AGAGGAAAAA GAAGAAAATA AAAACAATGC AGTGAATCAC 1260
AGATGTCTCC TGAAAGAACT CTTTTAGATG AAATCATTCT ACTCAAATGT ACCTTAATTT 1320
TTTTTTTTCC CTGAGTAAAA GCAAGAAATT TCTTCCTTTG GAAAAAATAT ATATATTAAA 1380
AAACCACTTT TAGATGGTTT TTTTTAAAAA AAAAAAAAAA ACTGGTAAAA TTACTTTTGG 1440
CAGACAGTGT TTTATGAATT ATGTATCATG TTGATATATA ATATGTTAAT GTGTCATGTA 1500 
ATTTTTACTT TGTACAAAGC AAATAAAGAT CTTTCTCAAA AAAAAAAAAA AAAA 



SEQ ID NO:152 PFG1 Protein sequence:
Protein Accession #: NP_060376.1

1 11 21 31 41 51
| | | | | |

MELVAGCYEQ VLFGFAVHPE PKACGDHEQW TLVADFTHHA HTASLSAVAV NSRFVVTGSK 60 
DETIHIYDMK KKIEHGALVH HSGTITCLKF YGNRHLISGA EDGLICIWDA KKWECLKSIK 120 
AHKGQVTFLS IHPSGKLALS VGTDKTLRTW NLVEGRSAFI KNIKQNAHIV EWSPRGEQYV 180 
VIIQNKIDIY QLDTASISGT ITNEKRISSV KFLSESVLAV AGDEEVIRFF DCDSLVCLCE 240

FKAHENRVKD MFSFEIPEHH VIVSASSDGF IKMWKLKQDK KVPPSLLCEI NTNARLTCLG 300

VWLDKVADMK SLPPAAEPSP VSKEQSKIGK KEPGDTVHKE EKRSKPNTKK RGLTGDSKKA 360 
TKESGLISTK KRKMVEMLEK KRKKKKIKTM Q



SEQ ID NO:153 PFD6 DNA SEQUENCE

Nucleic Acid Accession #: NM_014668

Coding sequence: 110-2953 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51
| | | | | |

GATGTCTTGG ACATGCTCTG GCTGGCTAAT CTCCATGTTC TAGCCGACTG AAAATACGGT 60 
GGCCAAGTGG ATGGTGTGCT TATTTGCAGT CTAAAGAAAT TTCCTTTTGA TGTGGCAGAA 120
AATCGAGGAT GTGGAGTGGA GACCCCAGAC TTACTTGGAG CTGGAGGGTC TGCCTTGCAT 180 

CCTGATCTTC AGTGGGATGG ACCCGCATGG GGAGTCCTTG CCGAGGTCTT TGAGGTACTG 240
TGACCTGCGA TTGATAAACT CCTCCTGCTT GGTGAGAACA GCCTTGGAGC AGGAGCTGGG 300 
CCTGGCTGCC TACTTTGTGA GCAACGAGGT TCCCTTGGAG AAGGGGGCTA GGAACGAGGC 360 
CTTGGAGAGT GATGCTGAGA AGCTGAGCAG CACAGACAAC GAGGATGAGG AGCTGGGGAC 420
AGAAGGCTCT ACCTCGGAGA AGAGAAGCCC CATGAAAAGG GAGAGGTCCC GCTCCCACGA 480

CTCAGCATCC TCATCCCTCT CCTCCAAGGC TTCCGGTTCA GCGCTCGGTG GCGAGTCCTC 540
GGCTCAGCCC ACAGCACTCC CCCAGGGAGA GCATGCCAGG TCGCCCCAGC CCCGTGGCCC 600 
CGCAGAGGAG GGCAGAGCCC CTGGTGAGAA ACAGAGGCCC CGGGCAAGTC AGGGGCCACC 660 
CTCGGCCATC AGCAGGCACA GTCCCGGGCC GACGCCCCAG CCCGACTGTA GCCTCAGGAC 720
CGGCCAGAGG AGCGTCCAGG TGTCGGTCAC CTCGTCGTGC TCCCAGCTGT CCTCCTCCTC 780 

GGGCTCATCC TCCTCATCCG TGGCGCCCGC TGCCGGCACG TGGGTCCTGC AGGCCTCCCA 840
GTGCTCCTTG ACCAAGGCCT GCCGCCAGCC ACCCATTGTC TTCTTGCCCA AGCTCGTGTA 900 
CGACATGGTT GTGTCCACTG ACAGCAGTGG CCTGCCCAAG GCCGCCTCCC TCCTGCCCTC 960
CCCCTCGGTC ATGTGGGCCA GCTCTTTCCG CCCCCTGCTC AGCAAGACCA TGACATCCAC 1020 
CGAGCAGTCC CTCTACTACC GGCAGTGGAC GGTGCCCCGG CCCAGCCACA TGGACTACGG 1080 

CAACCGGGCC GAGGGCCGCG TGGACGGCTT CCACCCCCGC AGGCTGCTGC TCAGCGGCCC 1140
CCCTCAGATC GGGAAGACAG GTGCCTACCT GCAGTTCCTC AGTGTCCTGT CCAGGATGCT 1200 
TGTTCGGCTC ACAGAAGTGG ATGTCTATGA CGAGGAGGAG ATCAATATCA ACCTCAGAGA 1260 
AGAATCTGAC TGGCATTATC TCCAGCTTAG CGACCCCTGG CCAGACCTGG AGCTGTTCAA 1320 
GAAGTTGCCC TTTGACTACA TCATTCACGA CCCGAAGTAT GAAGATGCCA GCCTGATTTG 1380 

TTCGCACTAT CAGGGTATAA AGAGTGAAGA CAGAGGGATG TCCCGGAAGC CGGAGGACCT 1440
TTATGTGCGG CGTCAGACGG CACGGATGAG ACTGTCCAAG TACGCAGCGT ACAACACTTA 1500 
CCACCACTGT GAGCAGTGCC ACCAGTACAT GGGCTTCCAC CCCCGCTACC AGCTGTATGA 1560 
GTCCACCCTG CACGCCTTTG CCTTCTCTTA CTCCATGCTA GGAGAGGAGA TCCAGCTGCA 1620 
CTTCATCATC CCCAAGTCCA AGGAGCACCA CTTTGTCTTC AGCCAACCTG GAGGCCAGCT 1680 

GGAGAGCATG CGACTACCCC TCGTGACAGA CAAGAGCCAT GAATATATAA AAAGTCCGAC 1740
ATTCACTCCA ACCACCGGCC GTCACGAACA TGGGCTCTTT AATCTGTACC ACGCAATGGA 1800 
CGGTGCCAGC CATTTGCACG TGCTGGTTGT CAAGGAATAC GAGATGGCAA TTTATAAGAA 1860 
ATATTGGCCC AACCACATCA TGCTGGTGCT CCCCAGTATC TTCAACAGTG CTGGAGTTGG 1920 
TGCTGCTCAT TTCCTCATCA AGGAGCTGTC CTACCATAAC CTGGAGCTCG AGCGGAACCG 1980 

GCAGGAGGAG CTGGGAATCA AGCCGCAGGA CATCTGGCCT TTCATTGTGA TCTCTGATGA 2040
CTCCTGCGTG ATGTGGAACG TGGTGGATGT CAACTCTGCT GGGGAGAGAA GCAGGGAGTT 2100
CTCCTGGTCG GAAAGGAACG TGTCTTTGAA GCACATCATG CAGCACATCG AGGCGGCCCC 2160 
CGACATCATG CACTACGCCC TGCTGGGCCT GCGGAAGTGG TCCAGCAAGA CCCGGGCCAG 2220 
CGAGGTGCAA GAGCCCTTCT CCCGCTGCCA CGTGCACAAC TTCATCATCC TGAACGTGGA 2280

CCTGACCCAG AACGTGCAGT ACAACCAGAA CCGGTTCCTG TGTGACGATG TAGACTTCAA 2340
CCTGCGGGTG CACAGCGCCG GCCTCCTGCT CTGCCGGTTC AACCGCTTCA GCGTGATGAA 2400 
GAAGCAGATC GTGGTGGGCG GCCACAGGTC CTTCCACATC ACATCCAAGG TGTCTGATAA 2460
CTCTGCCGCG GTCGTGCCGG CCCAGTACAT CTGTGCCCCG GACAGCAAGC ACACGTTCCT 2520 
CGCAGCGCCC GCCCAGCTCC TGCTGGAGAA GTTCCTGCAG CACCACAGCC ACCTCTTCTT 2580

CCCGCTGTCC CTGAAGAACC ATGACCACCC AGTGCTGTCT GTCGACTGTT ACCTGAACCT 2640
GGGATCTCAG ATTTCTGTTT GCTATGTGAG CTCCAGGCCC CACTCTTTAA ACATCAGCTG 2700
CTCGGACTTG CTGTTCAGTG GGCTGCTGCT GTACCTCTGT GACTCTTTTG TGGGAGCTAG 2760 
CTTTTTGAAA AAGTTTCATT TTCTGAAAGG TGCGACGTTG TGTGTCATCT GTCAGGACCG 2820
GAGCTCACTG CGCCAGACGG TCGTCCGCCT GGAGCTCGAG GACGAGTGGC AGTTCCGGCT 2880 
GCGCGATGAG TTCCAGACCG CCAATGCCAG GGAAGACCGG CCGCTCTTTT TTCTGACGGG 2940
ACGACACATC TGAGGAAGAC AGCGGCGAGT TTTCTGAAGA GATGAGTGCT CAGAGCCCTC 3000
ATGCTGTTGA GGCTAAAGGG AGGCCTGGAA CGGTGGGGCG TTTGACTGGA ATGGACCCCA 3060 
GGGACTGTCC AGGTGCAGCC CCTCCTAGTA CACATGGGCC CCCGAGGCCG TGGTCCTGGG 3120 
AGCCAGGAAG ACTCCGCAGT GGGTGAGAAT GAAAACTTGA GACTCCCAAG TTCTGGGCCA 3180 
GCCCATTGCT CTGGGCTGTT TTAAAGCCCA TTTCACGAGG AACAAAGATT TACTTCCTGT 3240
CCTGCCATTC GTGTGCTTCC ATGGACAAAC CTGAITHTT TCTCTTAGTT CTAAAGAATC 3300 
TTGGGTTATT TTGTAGCGGT GCCAGTATTT CAGTAGATGG GATTTCAGCC AAGTAGGTTC 3360
CCCTGTAACC TCCTACAAAG CAATATTCCA AAGGAACATT TTAACTGTAA AGGCTGGAGA 3420 
CAAGAAAAAA TAAGTAGATC GTTTTAATAA CAATTATTTA ATTGCCTATA AGTTTGCTGT 3480 
TTCAGAGGCT AGCCCAAAGG CATCAAATTT AATAAAGTTA AACAAATTGA TTTACTTCAG 3540
AGCAAATATG ATCCTATTAA AATAATATAG GGTAAATACC CTACCTCTTA GAAAGGGCAA 3600 
AAATGCAAAG AAGCTTTCTT TAAAACTAAA AGGGTTTTTT GGGGGGGGAG TTGGCGGGGA 3660 
GGAAATAAGG CTAACAGAGG TTGACCTAAA ATTAGCCTTA CAAAGGAGAA AGGACCACAT 3720
TGCTTACTTG AAACAGACAA TGAAAACAAC CAAAGTGATA TATAAAATAG TTGATGAGAA 3780 
CTAGACTTAT GACTGTAGTT TACTAGAGTT TAGTTTTCAG TTGCTGAAGT AGCTCATTTT 3840
CTCTTACTAA TGTTTGGTTC CTCAGGGAAG AATCTCACTT GACTAGAGAG GAGGTGGGAA 3900 
CAGAAGAGAG AAGGAGGCAG GGAGATGTAT TTCTTAGGGC TCACCCCTTC ACAGACTGAC 3960
AGAATGGTTT TGTTTTGTTT TGTTTTGTTT TGTTTTGTTT TTGAGATGGA CTCTAGCTCT 4020 
GTCACCCAGG CTGGAGTGCA GTGGTGCGAT CTCGGCTCAC TGCAAGCTCC GCCTCCCGGG 4080 
TTCTCACCAT TCTCCTGCCT CAGCCTCCCG AGTAGCTGGG ACTACAGGCG CCCACCACCA 4140 
CGCCCGGCTA ATTTTTTGTA TTTTTTAGTA GAGACGGGGT TTCACCATGT TAGCCAGGAT 4200
GGTCTCGATC TCCTGACCTC GTGATCCGCC CGCCTCGGCC TCCCAAAGTG CTGGGATTAC 4260 
AGGCGTGAGC CACCGTGCCT GCCCCAGAAT GGTTTTTAAA GCCACAGTTG AGAGGCCACC 4320
CATTGCCCGG CGCCTGGACA GTGATCATCT TGTTCATCTT GTTCAGTCCT TTCTTGTGTG 4380
ATTGGAATTA TTCATCCCCT TTGAAAGATG AGAAGGTTGA GATGCAAAGA GTCTACCTTT 4440
CCAAGTTCTC ACTGCTGGAA AGAGCTAGAA GCACAGTTCA AAGTTCTGGC TTCTGGACTC 4500 
TGCAGTCCAG GTCTCCCTTC TCCCACTTGC CTACCCTCAA TGCCACACTG TTTTTGAAGT 4560 
GGCCCATAAC TTGAAGGAAA AGTTTAAAGA CAGTTCAATT TAATCATCAG AATGCATTCT 4620
TTTTTTTTTC GGAGACGGAG TTTCACTCTT GCTGCCCAGG CTGGAGTGCA ATGGTGCAAT 4680
GATCTCGGCT CACTGCAACC TCTGCCTCCT GGGTTCAAGT GATTCTCCAG CCTCAGCCTC 4740 
CCGAGTAGCT GGGATTATGG GCGCCCACCA CCATGCCCAG CTAATTTTTG TATTTTTTTT 4800 
TTTTAGTAGA GATGGGGTTT CGCCAGGTTG GCCAGGCTGG TCTTGTGAAC TCCTGGCCTC 4860 
AGGTGATCTG CCCACCTCAT CCTCCAAAAG TGCTGGGATT ACAGGCATGA GCCACTGCGC 4920
CTGGCCTCAG AATGCATTCT TACACATCTA TCCTAGACAT TTATAAGCAC TCTAATGGAT 4980 
AACAATCCAA GAATAAATGA TTGTAAAAGA TGATGCCGAA GAGTTGATGT CAATCTTTTT 5040 
TTCCTAAGAA AAAAAGTCCG CGAGTATTAA ATATTTAGAT CAATGTTTAT AAAATGATTA 5100 
CTTTGTATAT CTCATTATTC CTATTTTGGA ATAAAAACTG ACCTTCTTTA ATCATATACT 5160 
TGTCTTTTGT AAATAGCAGC TTTTGTGTCA TTCTCCCCAC TTTATTAGTT AATTTAAATT 5220 
GGAAAAAACC CTCAAACTAA TATTCTTGTC TGTTCCAGTC TTATAAATAA AACTTATAAT 5280 
GCATG 



SEQ ID NO:154 PFD6 Protein sequence: 
Protein Accession #: NP_055483.1 

1 11 21 31 41 51 
| | | | | |

MWQKIEDVEW RPQTYLELEG LPCILIFSGM DPHGESLPRS LRYCDLRLIN SSCLVRTALE 60
QELGLAAYFV SNEVPLEKGA RNEALESDAE KLSSTDNEDE ELGTEGSTSE KRSPMKRERS 120 
RSHDSASSSL SSKASGSALG GESSAQPTAL PQGEHARSPQ PRGPAEEGRA PGEKQRPRAS 180 
QGPPSAISRH SPGPTPQPDC SLRTGQRSVQ VSVTSSCSQL SSSSGSSSSS VAPAAGTWVL 240 
QASQCSLTKA CRQPPIVFLP KLVYDMVVST DSSGLPKAAS LLPSPSVMWA SSFRPLLSKT 300
MTSTEQSLYY RQWTVPRPSH MDYGNRAEGR VDGFHPRRLL LSGPPQIGKT GAYLQFLSVL 360 
SRMLVRLTEV DVYDEEEINI NLREESDWHY LQLSDPWPDL ELFKKLPFDY IIHDPKYEDA 420
SLICSHYQGI KSEDRGMSRK PEDLYVRRQT ARMRLSKYAA YNTYHHCEQC HQYMGFHPRY 480 
QLYESTLHAF AFSYSMLGEE IQLHFIIPKS KEHHFVFSQP GGQLESMRLP LVTDKSHEYI 540
KSPTFTPTTG RHEHGLFNLY HAMDGASHLH VLVVKEYEMA IYKKYWPNHI MLVLPSIFNS 600 
AGVGAAHFLI KELSYHNLEL ERNRQEELGI KPQDIWPFIV ISDDSCVMWN VVDVNSAGER 660
SREFSWSERN VSLKHIMQHI EAAPDIMHYA LLGLRKWSSK TRASEVQEPF SRCHVHNFII 720
LNVDLTQNVQ YNQNRFLCDD VDFNLRVHSA GLLLCRFNRF SVMKKQIVVG GHRSFHITSK 780
VSDNSAAVVP AQYICAPDSK HTFLAAPAQL LLEKFLQHHS HLFFPLSLKN HDHPVLSVDC 840 
YLNLGSQISV CYVSSRPHSL NISCSDLLFS GLLLYLCDSF VGASFLKKFH FLKGATLCVI 900
CQDRSSLRQT VVRLELEDEW QFRLRDEFQT ANAREDRPLF FLTGRHI 



SEQ ID NO:155 PFC6 DNA SEQUENCE 

Nucleic Acid Accession #: NM_000522 

Coding sequence: 1-1167 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51
| | | | | |

ATGACAGCCT CCGTGCTCCT CCACCCCCGC TGGATCGAGC CCACCGTCAT GTTTCTCTAC 60
GACAACGGCG GCGGCCTGGT GGCCGACGAG CTCAACAAGA ACATGGAAGG GGCGGCGGCG 120
GCTGCAGCAG CGGCTGCAGC GGCGGCGGCT GCCGGGGCCG GGGGCGGGGG CTTCCCCCAC 180 
CCGGCGGCTG CGGCGGCAGG GGGCAACTTC TCGGTGGCGG CCGCGGCCGC GGCTGCGGCG 240 
GCCGCCGCGG CCAACCAGTG CCGCAACCTG ATGGCGCACC CGGCGCCCTT GGCGCCAGGA 300 
GCCGCGTCCG CCTACAGCAG CGCCCCCGGG GAGGCGCCCC CGTCGGCTGC CGCCGCTGCT 360 
GCCGCGGCTG CCGCTGCAGC CGCCGCCGCC GCCGCCGCGT CGTCCTCGGG AGGTCCCGGC 420
CCGGCGGGCC CGGCGGCGGC AGAGGCGGCC AAGCAATGCA GCCCCTGCTC GGCAGCGGCG 480 
CAGAGCTCGT CGGGGCCCGC GGCGCTGCCC TATGGCTACT TCGGCAGCGG CTACTACCCG 540 
TGCGCCCGCA TGGGCCCGCC CCCCAACGCC ATCAAGTCGT GCCCCCAGCC CCCCTCGGCC 600 
GCCGCCGCCG CCGCCTTCGC GGACAAGTAC ATGGATACCG CCGGCCCAGC TGCCGAGGAG 660
TTCAGCTCCC GCGCTAAGGA GTTCGCGTTC TACCACCAGG GCTACGCAGC CGGGCCTTAC 720 
CACCACCATC AGCCCATGCC TGGCTACCTG GATATGCCAG TGGTGCCGGG CCTCGGGGGC 780 
CCCGGCGAGT CGCGCCACGA ACCCTTGGGT CTTCCCATGG AAAGCTACCA GCCCTGGGCG 840 
CTGCCCAACG GCTGGAACGG CCAAATGTAC TGCCCCAAAG AGCAGGCGCA GCCTCCCCAC 900
CTCTGGAAGT CCACTCTGCC CGACGTGGTC TCCCATCCCT CGGATGCCAG CTCCTATAGG 960

AGGGGGAGAA AGAAGCGCGT GCCTTATACC AAGGTGCAAT TAAAAGAACT TGAACGGGAA 1020 
TACGCCACGA ATAAATTCAT TACTAAGGAC AAACGGAGGC GGATATCAGC CACGACGAAT 1080
CTCTCTGAGC GGCAGGTCAC AATCTGGTTC CAGAACAGGA GGGTTAAAGA GAAAAAAGTC 1140
ATCAACAAAC TGAAAACCAC TAGTTAA



SEQ ID NO:156 PFC6 Protein sequence:
Protein Accession #: NP_000513.1

1 11 21 31 41 51
| | | | | |

MTASVLLHPR WIEPTVMFLY DNGGGLVADE LNKNMEGAAA AAAAAAAAAA AGAGGGGFPH 60 
PAAAAAGGNF SVAAAAAAAA AAAANQCRNL MAHPAPLAPG AASAYSSAPG EAPPSAAAAA 120
AAAAAAAAAA AAASSSGGPG PAGPAAAEAA KQCSPCSAAA QSSSGPAALP YGYFGSGYYP 180
CARMGPPPNA IKSCPQPPSA AAAAAFADKY MDTAGPAAEE FSSRAKEFAF YHQGYAAGPY 240

HHHQPMPGYL DMPVVPGLGG PGESRHEPLG LPMESYQPWA LPNGWNGQMY CPKEQAQPPH 300
LWKSTLPDVV SHPSDASSYR RGRKKRVPYT KVQLKELERE YATNKFITKD KRRRISATTN 360
LSERQVTIWF QNRRVKEKKV INKLKTTS 



SEQ ID NO:157 PFA3 DNA SEQUENCE 

Nucleic Acid Accession #: AW102723

Coding sequence: 523-2676 (underlined sequences correspond to start and stop codons) 

1 11 21 31 41 51
| | | | | |


CCCTTATGGC GATTGGGCGG CTGCAGAGAC CAGGACTCAG TTCCCCTGCC CTAGTCTGAG 60 
CCTAGTGGGT GGGACTCAGC TCAGAGTCAG TTTTCAGAAG CAGGTTTCAG TTGCAGAGTT 120 

TTCCTACACT TTTCCTGCGC TAGAGCAGCG AGCAGCCTGG AACAGACCCA GGCGGAGGAC 180
ACCTGTGGGG GAGGGAGCGC CTGGAGGAGC TTAGAGACCC CAGCCGGGCG TGATCTCACC 240 
ATGTGCGGAT TTGCGAGGCG CGCCCTGGAG CTGCTAGAGA TCCGGAAGCA CAGCCCCGAG 300
GTGTGCGAAG CCACCAAGAC TGCGGCTCTT GGAGAAAGCG TGAGCAGGGG GCCACCGCGG 360 
TCTCCGGCCT GTCTGCACCC TGTCGCCTGA GCTGCCTGAC AGTGACAATG ACATCCCAGT 420 

TACCAGTGTC CTTGAATTGA TAGTGGCTTC TGTTTGTCAG TCTCATATAA GAACTACAGC 480

TCATCAGGAG GAGATCGCAG CAGGGTAAGA GACACCAACA CCATGTTCTG CACGAAGCTC 540 
AAGGATCTCA AGATCACAGG AGAGTGTCCT TTCTCCTTAC TGGCACCAGG TCAAGTTCCT 600
AACGAGTCTT CAGAGGAGGC AGCAGGAAGC TCAGAGAGCT GCAAAGCAAC CGTGCCCATC 660 
TGTCAAGACA TTCCTGAGAA GAACATACAA GAAAGTCTTC CTCAAAGAAA AACCAGTCGG 720

AGCCGAGTCT ATCTTCACAC TTTGGCAGAG AGTATTTGCA AACTGATTTT CCCAGAGTTT 780

GAACGGCTGA ATGTTGCACT TCAGAGAACA TTGGCAAAGC ACAAAATAAA AGAAAGCAGG 840
AAATCTTTGG AAAGAGAAGA CTTTGAAAAA ACAATTGCAG AGCAAGCAGT GCAGCAGAGT 900
CCAGTGGAGT TATCAAAGAA TCTCTTGGTG AAGAGGTTTT TAAAATATGT TACGAGGAAG 960
ATGAAAACAT CCTTGGGGTG GTTGGAGGCA CCCTTAAAGA TTTTTAAACA GCTTCAGTAC 1020

CCTTCTGAAA CAGAGCAGCC ATTGCCAAGA AGCAGGAAAA AGGGGCAGCT TGAGGACGCC 1080
TCCATTCTAT GCCTGGATAA GGAGGATGAT TTTCTACATG TTTACTACTT CTTCCCTAAG 1140
AGAACCACCT CCCTGATTCT TCCCGGCATC ATAAAGGCAG CTGCTCACGT ATTATATGAA 1200 
ACGGAAGTGG AAGTGTCGTT AATGCCTCCC TGCTTCCATA ATGATTGCAG CGAGTTTGTG 1260 
AATCAGCCCT ACTTGTTGTA CTCCGTTCAC ATGAAAAGCA CCAAGCCATC CCTGTCCCCC 1320

AGCAAACCCC AGTCCTCGCT GGTGATTCCC ACATCGCTAT TCTGCAAGAC ATTTCCATTC 1380
CATTTCATGT TTGACAAAGA TATGACAATT CTGCAATTTG GCAATGGCAT CAGAAGGCTG 1440
ATGAACAGGA GAGACTTTCA AGGAAAGCCT AATTTTGAAT ACTTTGAAAT TCTGACTCCA 1500 
AAAATCAACC AGACCTTTAG CGGGATCATG ACTATGTTGA ATATGCAGTT TGTTGTACGA 1560 
GTGAGGAGAT GGGACAACTC TGTGAAGAAA TCTTCAAGGG TTATGGACCT CAAAGGCCAA 1620

ATGATCTACA TTGTTGAATC CAGTGCAATC TTGTTTTTGG GGTCACCCTG TGTGGACAGA 1680
TTAGAAGATT TTACAGGACG AGGGCTCTAC CTCTCAGACA TCCCAATTCA CAATGCACTG 1740 
AGGGATGTGG TCTTAATAGG GGAACAAGCC CGAGCTCAAG ATGGCCTGAA GAAGAGGCTG 1800 
GGGAAGCTGA AGGCTACCCT TGAGCAAGCC CACCAAGCCC TGGAGGAGGA GAAGAAAAAG 1860
ACAGTAGACC TTCTGTGCTC CATATTTCCC TGTGAGGTTG CTCAGCAGCT GTGGCAAGGG 1920 

CAAGTTGTGC AAGCCAAGAA GTTCAGTAAT GTCACCATGC TCTTCTCAGA CATCGTTGGG 1980
TTCACTGCCA TCTGCTCCCA GTGCTCACCG CTGCAGGTCA TCACCATGCT CAATGCACTG 2040
TACACTCGCT TCGACCAGCA GTGTGGAGAG CTGGATGTCT ACAAGGTGGA GACCATTGCG 2100 
ATGCCTATTG TGTGGCTTGG GGGATTACAC AAAGAGAGTG ATACTCATGC TGTTCAGATA 2160 
GCGCTGATGG CCCTGAAGAT GATGGAGCTC TCTGATGAAG TTATGTCTCC CCATGGAGAA 2220

CCTATCAAGA TGCGAATTGG ACTGCACTCT GGATCAGTTT TTGCTGGCGT CGTTGGAGTT 2280
AAAATGCCCC GTTACTGTCT TTTTGGAAAC AATGTCACTC TGGCTAACAA ATTTGAGTCC 2340 
TGCAGTGTAC CACGAAAAAT CAATGTCAGC CCAACAACTT ACAGATTACT CAAAGACTGT 2400
CCTGGTTTCG TGTTTACCCC TCGATCAAGG GAGGAACTTC CACCAAACTT CCCTAGTGAA 2460
ATCCCCGGAA TCTGCCATTT TCTGGATGCT TACCAACAAG GAACAAACTC AAAACCATGC 2520
TTCCAAAAGA AAGATGTGGA AGATGCAAGC CAATTTTTTA GGCAAAGCAT CAGGAATAGA 2580
TTAGCAACCT ATATACCTAT TTATAAGTCT TTGGGGTTTG ACTCATTGAA GATGTGTAGA 2640
GCCTCTGAAA GCACTTTAGG GATTGTAGAT GGCTAACAAG CAGTATTAAA ATTTCAGGAG 2700
CCAAGTCACA ATCTTTCTCC TGTTTAACAT GACAAAATGT ACTCACTTCA GTACTTCAGC 2760 
TCTTCAAGAA AAAAAAAAAA ACCTTAAAAA GCTACTTTTG TGGGAGTATT TCTATTATAT 2820 
AACCAGCACT TACTACCTGT ACTCAAAATT CAGCACCTTG TACATATATC AGATAATTGT 2880 
AGTCAATTGT ACAAACTGAT GGAGTCACCT GCAATCTCAT ATCCTGGTGG AATGCCATGG 2940
TTATTAAAGT GTGTTTGTGA TAGTTGTCGT CAAAAAAAAA AAAAAAAAAA AAAAAAAAAA 3000 
AAAA 



SEQ ID NO:158 PFA3 Protein sequence:
Protein Accession #: NP_000847.1

1 11 21 31 41 51
| | | | | |

MFCTKLKDLK ITGECPFSLL APGQVPNESS EEAAGSSESC KATVPICQDI PEKNIQESLP 60 
QRKTSRSRVY LHTLAESICK LIFPEFERLN VALQRTLAKH KIKESRKSLE REDFEKTIAE 120
QAVQQSPVEL SKNLLVKRFL KYVTRKMKTS LGWLEAPLKI FKQLQYPSET EQPLPRSRKK 180
GQLEDASILC LDKEDDFLHV YYFFPKRTTS LILPGIIKAA AHVLYETEVE VSLMPPCFHN 240 
DCSEFVNQPY LLYSVHMKST KPSLSPSKPQ SSLVIPTSLF CKTFPFHFMF DKDMTILQFG 300
NGIRRLMNRR DFQGKPNFEY FEILTPKINQ TFSGIMTMLN MQFVVRVRRW DNSVKKSSRV 360 
MDLKGQMIYI VESSAILFLG SPCVDRLEDF TGRGLYLSDI PIHNALRDVV LIGEQARAQD 420
GLKKRLGKLK ATLEQAHQAL EEEKKKTVDL LCSIFPCEVA QQLWQGQVVQ AKKFSNVTML 480 
FSDIVGFTAI CSQCSPLQVI TMLNALYTRF DQQCGELDVY KVETIAMPIV WLGGLHKESD 540 
THAVQIALMA LKMMELSDEV MSPHGEPIKM RIGLHSGSVF AGVVGVKMPR YCLFGNNVTL 600 
ANKFESCSVP RKINVSPTTY RLLKDCPGFV FTPRSREELP PNFPSEIPGI CHFLDAYQQG 660
TNSKPCFQKK DVEDASQFFR QSIRNRLATY IPIYKSLGFD SLKMCRASES TLGIVDG 



SEQ ID NO:159 PFA1 DNA SEQUENCE 

Nucleic Acid Accession #: NM_004362

Coding sequence: 102-1934 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51
| | | | | |

CGCCGGCGGG ACTGGTCTGA AGAGACGCGG GGACAAAGTG GCAACGACTT GGACATCTGA 60 
GCTGTCACTG CCGAAAACAG GCCGCAAGAG AGATAATCAA TATGCATTTC CAAGCCTTTT 120
GGCTATGTTT GGGTCTTCTG TTCATCTCAA TTAATGCAGA ATTTATGGAT GATGATGTTG 180 
AGACGGAAGA CTTTGAAGAA AATTCAGAAG AAATTGATGT TAATGAAAGT GAACTTTCCT 240 
CAGAGATTAA ATATAAGACA CCTCAACCTA TAGGAGAAGT ATATTTTGCA GAAACTTTTG 300 
ATAGTGGAAG GTTGGCTGGA TGGGTCTTAT CAAAAGCAAA GAAAGATGAC ATGGATGAGG 360 
AAATTTCAAT ATACGATGGA AGATGGGAAA TTGAAGAGTT GAAAGAAAAC CAGGTACCTG 420
GTGACAGAGG ACTGGTATTA AAATCTAGAG CAAAGCATCA TGCAATATCT GCTGTATTAG 480
CAAAACCATT CATTTTTGCT GATAAACCCT TGATAGTTCA ATATGAAGTA AATTTTCAAG 540 
ATGGTATTGA TTGTGGAGGT GCATACATTA AACTCCTAGC AGACACTGAT GATTTGATTC 600
TGGAAAACTT TTATGATAAA ACATCCTATA TCATTATGTT TGGACCAGAT AAATGTGGAG 660
AAGATTATAA ACTTCATTTT ATCTTCAGAC ATAAACATCC CAAAACTGGA GTTTTCGAAG 720 
AGAAACATGC CAAACCTCCA GATGTAGACC TTAAAAAGTT CTTTACAGAC AGGAAGACTC 780
ATCTTTATAC CCTTGTGATG AATCCAGATG ACACATTTGA GGTGTTAGTT GATCAAACAG 840 
TTGTAAACAA AGGAAGCCTC CTAGAGGATG TGGTTCCTCC TATCAAACCT CCCAAAGAAA 900
TTGAAGATCC CAATGATAAA AAACCTGAGG AATGGGATGA AAGAGCAAAA ATTCCTGATC 960 
CTTCTGCCGT CAAACCAGAA GACTGGGATG AAAGTGAACC TGCCCAAATA GAAGATTCAA 1020 
GTGTTGTTAA ACCTGCTGGC TGGCTTGATG ATGAACCAAA ATTTATCCCT GATCCTAATG 1080 
CTGAAAAACC TGATGACTGG AATGAAGACA CGGATGGAGA ATGGGAGGCA CCTCAGATTC 1140
TTAATCCAGC ATGTCGGATT GGGTGTGGTG AGTGGAAACC TCCCATGATA GATAACCCAA 1200 
AATACAAAGG AGTATGGAGA CCTCCACTGG TCGATAATCC TAACTATCAG GGAATCTGGA 1260
GTCCTCGAAA AATTCCTAAT CCAGATTATT TCGAAGATGA TCATCCATTT CTTCTGACTT 1320 
CTTTCAGTGC TCTTGGTTTA GAGCTTTGGT CTATGACCTC TGATATCTAC TTTGATAATT 1380
TTATTATCTG TTCGGAAAAG GAAGTAGCAG ATCACTGGGC TGCAGATGGT TGGAGATGGA 1440 
AAATAATGAT AGCAAATGCT AATAAGCCTG GTGTATTAAA ACAGTTAATG GCAGCTGCTG 1500 
AAGGGCACCC ATGGCTTTGG TTGATTTATC TTGTGACAGC AGGAGTGCCA ATAGCATTAA 1560 
TTACTTCATT TTGTTGGCCA AGAAAAGTAA AGAAAAAACA TAAAGATACA GAGTATAAAA 1620 
AAACCGACAT ATGTATACCA CAAACAAAAG GAGTACTAGA GCAAGAAGAA AAGGAAGAGA 1680 
AAGCAGCCCT GGAAAAACCA ATGGACCTGG AAGAGGAAAA AAAGCAAAAT GATGGTGAAA 1740
TGCTTGAAAA AGAAGAGGAA AGTGAACCTG AGGAAAAGAG TGAAGAAGAA ATTGAAATCA 1800
TAGAAGGGCA AGAAGAAAGT AATCAATCAA ATAAGTCTGG GTCAGAGGAT GAGATGAAAG 1860
AAGCAGATGA GAGCACAGGA TCTGGAGATG GGCCGATAAA GTCAGTACGC AAAAGAAGAG 1920 
TACGAAAGGA CTAAACTAGA TTGAAATATT TTTAATTCCC GAGAGGATGT TTGGCATTGT 1980
AAAAATCAGC ATGCCAGACC TGAACTTTAA TCAGTCTGCA CATCCTGTTT CTAATATCTA 2040 
GCAACATTAT ATTCTTTCAG ACATTTATTT TAGTCCTTCA TTTCCGAGGA AAAAGAAGCA 2100
ACTTTGAAGT TACCTCATCT TTGAATTTAG AATAAAAGTG GCACATTACA TATCGGATCT 2160 
AAGAGATTAA TACCATTAGA AGTTACACAG TTTTAGTTGT TTGGAGATAG TTTTGGTTTG 2220
TACAGAACAA AATAATATGT AGCAGCTTCA TTGCTATTGG AAAAATCAGT TATTGGAATT 2280 
TCCACTTAAA TGGCTATACA ACAATATAAC TGGTAGTTCT ATAATAAAAA TGAGCATATG 2340 
TTCTGTTGTG AAGAGCTAAA TGCAATAAAG TTTCTGTATG GTTGTTTGAT TCTATCAACA 2400
ATTGAAAGTG TTGTATATGA CCCACATTTA CCTAGTTTGT GTCAAATTAT AGTTACAGTG 2460 
AGTTGTTTGC TTAAATTATA GATTCCTTTA AGGACATGCC TTGTTCATAA AATCACTGGA 2520
TTATATTGCA GCATATTTTA CATTTGAATA CAAGGATAAT GGGTTTTATC AAAACAAAAT 2580
GATGTACAGA TTTTTTTTCA AGTTTTTATA GTTGCTTTAT GCCAGAGTGG TTTACCCCAT 2640 
TCACAAAATT TCTTATGCAT ACATTGCTAT TGAAAATAAA ATTTAAATAT TTTTTCATCC 2700 
TGAAAAAAAA 



SEQ ID NO:160 PFA1 Protein sequence: 
Protein Accession #: NP_004353.1 

1 11 21 31 41 51
| | | | | |

MHFQAFWLCL GLLFISINAE FMDDDVETED FEENSEEEDV NESELSSEIK VKTPQPIGEV 60
YFAETFDSGR LAGWVLSKAK KDDMDEEISI YDGRWEIEEL KENQVPGDRG LVLKSRAKHH 120 
AISAVLAKPF IFADKPLIVQ YEVNFQDGID CGGAYIKLLA DTDDLILENF YDKTSYIIMF 180
GPDKCGEDYK LHFIFRHKHP KTGVFEEKHA KPPDVDLKKF FTDRKTHLYT LVMNPDDTFE 240
VLVDQTVVNK GSLLEDVVPP IKPPKEIEDP NDKKPEEWDE RAKIPDPSAV KPEDWDESEP 300 
AQIEDSSVVK PAGWLDDEPK FIPDPNAEKP DDWNEDTDGE WEAPQILNPA CRIGCGEWKP 360
PMIDNPKYKG VWRPPLVDNP NYQGIWSPRK IPNPDYFEDD HPFLLTSFSA LGLELWSMTS 420 
DIYFDNFIIC SEKEVADHWA ADGWRWKIMI ANANKPGVLK QLMAAAEGHP WLWLIYLVTA 480 

GVPIALITSF CWPRKVKKKH KDTEYKKTDI CIPQTKGVLE QEEKEEKAAL EKPMDLEEEK 540
KQNDGEMLEK EEESEPEEKS EEEIEIIEGQ EESNQSNKSG SEDEMKEADE STGSGDGPIK 600 
SVRKRRVRKD 



SEQ ID NO:161 PEZ9 DNA SEQUENCE

Nucleic Acid Accession #: NM_005932

Coding sequence: 75-2216 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51
| | | | | |

GCGGAGCGCG CGCTCCCAGC GAAAGCAGCA GGGCAGGGAT CTGCGTTGGA GGAAGGGACT 60
GCTCTGGTGC TAGAATGCTG TGCGTCGGAA GGCTGGGCGG CTTGGGAGCC AGAGCAGCAG 120 
CTCTGCCGCC CCGCCGGGCG GGCCGGGGAA GCCTCGAAGC CGGGATCCGG GCCCGAAGGG 180
TCAGCACCAG CTGGTCTCCC GTGGGCGCCG CCTTCAATGT CAAGCCCCAG GGCAGCCGCT 240 

TGGACCTGTT CGGCGAGCGG GCGCGTCTTT TTGGAGTTCC TGAGCTGAGT GCCCCAGAAG 300
GATTTCATAT TGCACAAGAA AAAGCCTTGA GAAAGACAGA ATTGCTTGTG GACCGTGCAT 360 
GTTCCACCCC ACCTGGGCCC CAGACCGTGC TGATCTTCGA TGAGCTCTCG GATTCCTTAT 420 
GCAGAGTGGC CGACTTGGCT GATTTTGTGA AAATCGCTCA CCCTGAGCCA GCATTCAGAG 480 
AAGCTGCGGA AGAAGCTTGT AGAAGTATTG GCACCATGGT AGAGAAGTTG AACACAAATG 540 

TGGATTTATA TCAAAGTTTG CAAAAATTAC TAGCTGATAA AAAACTTGTG GATTCCCTTG 600
ATCCAGAAAC AAGGCGAGTG GCTGAACTGT TTATGTTTGA TTTTGAAATT AGTGGAATCC 660
ATCTAGACAA ACAAAAGCGT AAAAGAGCAG TGGACCTCAA TGTTAAAATC TTGGATTTGA 720
GTAGTACATT TCTTATGGGA ACCAATTTTC CCAACAAGAT TGAGAAGCAT CTCTTACCAG 780
AACACATTCG TCGTAACTTT ACATCTGCTG GGGATCATAT CATAATTGAT GGTCTCCACG 840 

CAGAATCACC AGATGACTTG GTGCGAGAAG CTGCTTATAA AATTTTTCTT TATCCCAATG 900
CTGGTCAATT GAAATGTTTA GAAGAATTGC TCAGCACCAG AGATCTTCTG GCAAAGTTGG 960
TGGGGTATTC CACGTTTTCT CACAGGGCTC TCCAAGGAAC GATAGCTAAA AATCCAGAGA 1020
CTGTCATGCA GTTCCTTGAA AAACTATCTG ACAAACTTTC TGAAAGAACT CTGAAAGATT 1080 
TTGAGATGAT ACGAGGGATG AAAATGAAAC TGAATGCTCA AAATTCCGAA GTAATGCCCT 1140

GGGACCCCCC TTACTACAGT GGTGTGATTC GTGCAGAAAG GTATAATATT GAGCCCAGCC 1200
TATATTGCCC GTTTTTCTCT CTTGGAGCAT GCATGGAAGG CCTGAATATT TTGCTTAACA 1260 
GACTGTTGGG GATTTCATTA TATGCAGAGC AGCCTGCAAA AGGAGAGGTG TGGAGCGAAG 1320 
ATGTCCGAAA ACTGGCTGTT GTTCATGAAT CTGAAGGATT GTTGGGGTAC ATTTACTGTG 1380 
ATTTTTTTCA GCGAGCAGAC AAACCACATC AGGATTGCCA TTTCACTATC CGTGGAGGCA 1440

GACTAAAGGA AGATGGAGAC TATCAACTCC CACTTGTAGT TCTTATGCTG AATCTTCCCC 1500
GTTCCTCAAG GAGTTCTCCA ACTTTGCTAA CTCCTGGCAT GATGGAAAAT CTTTTCCATG 1560 
AAATGGGACA TGCCATGCAT TCAATGCTAG GACGTACTCG TTACCAACAC GTCACTGGGA 1620 
CCAGGTGCCC TACTGATTTT GCTGAGGTTC CTTCTATTCT GATGGAGTAC TTTGCAAATG 1680 
ATTATCGAGT AGTTAACCAA TTTGCCAGAC ATTATCAGAC TGGACAGCCA CTGCCAAAAA 1740 

ATATGGTGTC TCGTCTTTGT GAATCTAAAA AGGTTTGTGC TGCAGCTGAT ATGCAACTTC 1800
AGGTCTTTTA TGCCACTCTG GATCAAATCT ACCATGGGAA GCATCCCCTG AGGAATTCAA 1860 
CCACAGACAT TCTCAAGGAA ACACAAGAGA AATTCTATGG CCTACCATAT GTTCCAAATA 1920
CTGCCTGGCA GCTGCGATTC AGCCACCTCG TGGGGTATGG TGCTAGATAT TACTCTTACC 1980 
TCATGTCCAG AGCGGTCGCC TCCATGGTTT GGAAGGAGTG TTTTCTACAG GATCCTTTCA 2040

ACAGGGCTGC CGGGGAGCGC TATCGCAGGG AGATGCTGGC CCACGGTGGA GGCAGGGAGC 2100
CCATGCTCAT GGTTGAAGGT ATGCTTCAGA AGTGTCCTTC TGTTGATGAC TTCGTAAGTG 2160
CCCTCGTTTC CGACTTGGAT CTGGACTTCG AAACTTTCCT CATGGATTCT GAATAAAAGA 2220 
AACACTCTAC ACCTCTAATC AAGGTCATGT AGTAATGACT TTGTTATAAA TGCTACAGCT 2280 
GTGAGAGCTT GTTTCTGATT GTTTCATTGT TCGCTTCTGT AATTCTGAAA AACTTTAAAC 2340 

TGGTAGAACT TGGAATAAAT AATTTGTTTT AATTAAAAAA AAAAAAAAAA AA



SEQ ID NO:162 PEZ9 Protein sequence:
Protein Accession #: NP_005923.1

1 11 21 31 41 51
| | | | | |

MLCVGRLGGL GARAAALPPR RAGRGSLEAG IRARRVSTSW SPVGAAFNVK PQGSRLDLFG 60 
ERARLFGVPE LSAPEGFHIA QEKALRKTEL LVDRACSTPP GPQTVLIFDE LSDSLCRVAD 120
LADFVKIAHP EPAFREAAEE ACRSIGTMVE KLNTNVDLYQ SLQKLLADKK LVDSLDPETR 180
RVAELFMFDF EISGIHLDKQ KRKRAVDLNV KILDLSSTFL MGTNFPNKIE KHLLPEHIRR 240 
NFTSAGDHII IDGLHAESPD DLVREAAYKI FLYPNAGQLK CLEELLSSRD LLAKLVGYST 300
FSHRALQGTI AKNPETVMQF LEKLSDKLSE RTLKDFEMIR GMKMKLNAQN SEVMPWDPPY 360
YSGVIRAERY NIEPSLYCPF FSLGACMEGL NILLNRLLGI SLYAEQPAKG EVWSEDVRKL 420 
AVVHESEGLL GYIYCDFFQR ADKPHQDCHF TIRGGRLKED GDYQLPLVVL MLNLPRSSRS 480 
SPTLLTPGMM ENLFHEMGHA MHSMLGRTRY QHVTGTRCPT DFAEVPSILM EYFANDYRVV 540
NQFARHYQTG QPLPKNMVSR LCESKKVCAA ADMQLQVFYA TLDQIYHGKH PLRNSTTDIL 600 
KETQEKFYGL PYVPNTAWQL RFSHLVGYGA RYYSYLMSRA VASMVWKECF LQDPFNRAAG 660 
ERYRREMLAH GGGREPMLMV EGMLQKCPSV DDFVSALVSD LDLDFETFLM DSE 



SEQ ID NO:163 PEZ8 DNA SEQUENCE

Nucleic Acid Accession #: AF103907

Coding sequence: none (underlined sequences correspond to start and stop codons) 



1 11 21 31 41 51 
| | | | | |

ACAGAAGAAA TAGCAAGTGC CGAGAAGCTG GCATCAGAAA AACAGAGGGG AGATTTGTGT 60 
GGCTGCAGCC GAGGGAGACC AGGAAGATCT GCATGGTGGG AAGGACCTGA TGATACAGAG 120 
GAATTACAAC ACATATACTT AGTGTTTCAA TGAACACCAA GATAAATAAG TGAAGAGCTA 180 
GTCCGCTGTG AGTCTCCTCA GTGACACAGG GCTGGATCAC CATCGACGGC ACTTTCTGAG 240 
TACTCAGTGC AGCAAAGAAA GACTACAGAC ATCTCAATGG CAGGGGTGAG AAATAAGAAA 300
GGCTGCTGAC TTTACCATCT GAGGCCACAC ATCTGCTGAA ATGGAGATAA TTAACATCAC 360
TAGAAACAGC AAGATGACAA TATAATGTCT AAGTAGTGAC ATGTTTTTGC ACATTTCCAG 420 
CCCCTTTAAA TATCCACACA CACAGGAAGC ACAAAAGGAA GCACAGAGAT CCCTGGGAGA 480
AATGCCCGGC CGCCATCTTG GGTCATCGAT GAGCCTCGCC CTGTGCCTGG TCCCGCTTGT 540 
GAGGGAAGGA CATTAGAAAA TGAATTGATG TGTTCCTTAA AGGATGGGCA GGAAAACAGA 600 
TCCTGTTGTG GATATTTATT TGAACGGGAT TACAGATTTG AAATGAAGTC ACAAAGTGAG 660 
CATTACCAAT GAGAGGAAAA CAGACGAGAA AATCTTGATG GCTTCACAAG ACATGCAACA 720 
AACAAAATGG AATACTGTGA TGACATGAGG CAGCCAAGCT GGGGAGGAGA TAACCACGGG 780 
GCAGAGGGTC AGGATTCTGG CCCTGCTGCC TAAACTGTGC GTTCATAACC AAATCATTTC 840 
ATATTTCTAA CCCTCAAAAC AAAGCTGTTG TAATATCTGA TCTCTACGGT TCCTTCTGGG 900 
CCCAACATTC TCCATATATC CAGCCACACT CATTTTTAAT ATTTAGTTCC CAGATCTGTA 960
CTGTGACCTT TCTACACTGT AGAATAACAT TACTCATTTT GTTCAAAGAC CCTTCGTGTT 1020
GCTGCCTAAT ATGTAGCTGA CTGTTTTTCC TAAGGAGTGT TCTGGCCCAG GGGATCTGTG 1080 
AACAGGCTGG GAAGCATCTC AAGATCTTTC CAGGGTTATA CTTACTAGCA CACAGCATGA 1140
TCATTACGGA GTGAATTATC TAATCAACAT CATCCTCAGT GTCTTTGCCC ATACTGAAAT 1200 
TCATTTCCCA CTTTTGTGCC CATTCTCAAG ACCTCAAAAT GTCATTCCAT TAATATCACA 1260 
GGATTAACTT TTTTTTTTAA CCTGGAAGAA TTCAATGTTA CATGCAGCTA TGGGAATTTA 1320 
ATTACATATT TTGTTTTCCA GTGCAAAGAT GACTAAGTCC TTTATCCCTC CCCTTTGTTT 1380 
GATTTTTTTT CCAGTATAAA GTTAAAATGC TTAGCCTTGT ACTGAGGCTG TATACAGCAC 1440 
AGCCTCTCCC CATCCCTCCA GCCTTATCTG TCATCACCAT CAACCCCTCC CATACCACCT 1500 
AAACAAAATC TAACTTGTAA TTCCTTGAAC ATGTCAGGAC ATACATTATT CCTTCTGCCT 1560
GAGAAGCTCT TCCTTGTCTC TTAAATCTAG AATGATGTAA AGTTTTGAAT AAGTTGACTA 1620 
TCTTACTTCA TGCAAAGAAG GGACACATAT GAGATTCATC ATCACATGAG ACAGCAAATA 1680 
CTAAAAGTGT AATTTGATTA TAAGAGTTTA GATAAATATA TGAAATGCAA GAGCCACAGA 1740
GGGAATGTTT ATGGGGCACG TTTGTAAGCC TGGGATGTGA AGCAAAGGCA GGGAACCTCA 1800 
TAGTATCTTA TATAATATAC TTCATTTCTC TATCTCTATC ACAATATCCA ACAAGCTTTT 1860 
CACAGAATTC ATGCAGTGCA AATCCCCAAA GGTAACCTTT ATCCATTTCA TGGTGAGTGC 1920 
GCTTTAGAAT TTTGGCAAAT CATACTGGTC ACTTATCTCA ACTTTGAGAT GTGTTTGTCC 1980
TTGTAGTTAA TTGAAAGAAA TAGGGCACTC TTGTGAGCCA CTTTAGGGTT CACTCCTGGC 2040 
AATAAAGAAT TTACAAAGAG CTACTCAGGA CCAGTTGTTA AGAGCTCTGT GTGTGTGTGT 2100
GTGTGTGTGT GAGTGTACAT GCCAAAGTGT GCCTCTCTCT CTTGACCCAT TATTTCAGAC 2160 
TTAAAACAAG CATGTTTTCA AATGGCACTA TGAGCTGCCA ATGATGTATC ACCACCATAT 2220 
CTCATTATTC TCCAGTAAAT GTGATAATAA TGTCATCTGT TAACATAAAA AAAGTTTGAC 2280
TTCACAAAAG CAGCTGGAAA TGGACAACCA CAATATGCAT AAATCTAACT CCTACCATCA 2340 
GCTACACACT GCTTGACATA TATTGTTAGA AGCACCTCGC ATTTGTGGGT TCTCTTAAGC 2400 
AAAATACTTG CATTAGGTCT CAGCTGGGGC TGTGCATCAG GCGGTTTGAG AAATATTCAA 2460 
TTCTCAGCAG AAGCCAGAAT TTGAATTCCC TCATCTTTTA GGAATCATTT ACCAGGTTTG 2520 
GAGAGGATTC AGACAGCTCA GGTGCTTTCA CTAATGTCTC TGAACTTCTG TCCCTCTTTG 2580 
TGTTCATGGA TAGTCCAATA AATAATGTTA TCTTTGAACT GATGCTCATA GGAGAGAATA 2640
TAAGAACTCT GAGTGATATC AACATTAGGG ATTCAAAGAA ATATTAGATT TAAGCTCACA 2700 
CTGGTCAAAA GGAACCAAGA TACAAAGAAC TCTGAGCTGT CATCGTCCCC ATCTCTGTGA 2760 
GCCACAACCA ACAGCAGGAC CCAACGCATG TCTGAGATCC TTAAATCAAG GAAACCAGTG 2820 
TCATGAGTTG AATTCTCCTA TTATGGATGC TAGCTTCTGG CCATCTCTGG CTCTCCTCTT 2880 
GACACATATT AGCTTCTAGC CTTTGCTTCC ACGACTTTTA TCTTTTCTCC AACACATCGC 2940
TTACCAATCC TCTCTCTGCT CTGTTGCTTT GGACTTCCCC ACAAGAATTT CAACGACTCT 3000 
CAAGTCTTTT CTTCCATCCC CACCACTAAC CTGAATGCCT AGACCCTTAT TTTTATTAAT 3060 
TTCCAATAGA TGCTGCCTAT GGGCTATATT GCTTTAGATG AACATTAGAT ATTTAAAGCT 3120 
CAAGAGGTTC AAAATCCAAC TCATTATCTT CTCTTTCTTT CACCTCCCTG CTCCTCTCCC 3180 
TATATTACTG ATTGCACTGA ACAGCATGGT CCCCAATGTA GCCATGCAAA TGAGAAACCC 3240 
AGTGGCTCCT TGTGGTACAT GCATGCAAGA CTGCTGAAGC CAGAAGGATG ACTGATTACG 3300 
CCTCATGGGT GGAGGGGACC ACTCCTGGGC CTTCGTGATT GTCAGGAGCA AGACCTGAGA 3360
TGCTCCCTGC CTTCAGTGTC CTCTGCATCT CCCCTTTCTA ATGAAGATCC ATAGAATTTG 3420
CTACATTTGA GAATTCCAAT TAGGAACTCA CATGTTTTAT CTGCCCTATC AATTTTTTAA 3480 
ACTTGCTGAA AATTAAGTTT TTTCAAAATC TGTCCTTGTA AATTACTTTT TCTTACAGTG 3540 
TCTTGGCATA CTATATCAAC TTTGATTCTT TGTTACAACT TTTCTTACTC TTTTATCACC 3600
AAAGTGGCTT TTATTCTCTT TATTATTATT ATTTTCTTTT ACTACTATAT TACGTTGTTA 3660
TTATTTTGTT CTCTATAGTA TCAATTTATT TGATTTAGTT TCAATTTATT TTTATTGCTG 3720
ACTTTTAAAA TAAGTGATTC GGGGGGTGGG AGAACAGGGG AGGGAGAGCA TTAGGACAAA 3780 
TACCTAATGC ATGTGGGACT TAAAACCTAG ATGATGGGTT GATAGGTGCA GCAAACCACT 3840 
ATGGCACACG TATACCTGTG TAACAAACCT ACACATTCTG CACATGTATC CCAGAACGTA 3900 
AAGTAAAATT TAAAAAAAAG TGA 



PEZ8 Protein sequence: 
Protein Accession #: none 

SEQ ID NO:164 PEZ6 DNA SEQUENCE

Nucleic Acid Accession #: AB028945 

Coding sequence: 1-3765 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51
| | | | | |

ATGATGATGA ACGTCCCCGG CGGAGGAGCG GCCGCGGTGA TGATGACGGG CTACAATAAT 60 
GGTCGCTGTC CCCGGAATTC TCTCTACAGT GACTGCATTA TTGAGGAGAA GACGGTGGTC 120 
CTGCAGAAAA AAGACAATGA GGGCTTTGGA TTCGTGCTTC GAGGGGCCAA AGCTGACACA 180
CCCATTGAAG AATTCACACC AACACCGGCT TTCCCAGCCC TACAGTACCT GGAGTCCGTG 240 
GATGAAGGTG GGGTGGCGTG GCAAGCCGGA CTAAGGACCG GGGACTTCTT GATTGAGGTT 300 
AACAATGAGA ATGTTGTCAA AGTCGGCCAC AGGCAGGTGG TGAACATGAT CCGGCAGGGA 360
GGGAATCACC TGGTCCTTAA GGTGGTCACG GTGACCAGGA ATCTGGACCC CGACGACACC 420 
GCCAGGAAGA AAGCTCCCCC GCCTCCAAAG CGGGCACCGA CCACAGCCCT CACCCTGCGC 480 
TCCAAGTCCA TGACCTCGGA GCTGGAGGAG CTCGTGGATA AAGATAAACC CGAGGAGATA 540 
GTCCCGGCCT CCAAGCCCTC CCGCGCTGCT GAGAACATGG CTGTGGAACC GAGGGTGGCG 600 
ACCATCAAGC AGCGGCCCAG CAGCCGGTGC TTCCCGGCGG GCTCAGACAT GAACTCTGTG 660
TACGAACGCC AAGGAATCGC CGTGATGACG CCCACTGTTC CTGGGAGCCC AAAAGCCCCG 720 
TTTCTGGGCA TCCCTCGAGG TACGATGCGA AGGCAGAAAT CAATAGACAG CAGAATCTTT 780 
CTATCAGGAA TAACAGAGGA AGAGCGGCAG TTTCTGGCTC CTCCAATGCT GAAGTTCACC 840
AGAAGCCTGT CCATGCCGGA CACCTCTGAG GACATCCCCC CTCCACCGCA GTCTGTGCCC 900 
CCGTCCCCAC CACCACCTTC CCCAACCACT TACAACTGCC CCAAGTCCCC AACTCCAAGA 960
GTCTACGGGA CGATTAAGCC TGCGTTCAAT CAGAATTCTG CCGCCAAGGT GTCCCCCGCC 1020
ACCAGGTCCG ACACCGTGGC CACCATGATG AGGGAGAAGG GGATGTACTT CAGGAGAGAG 1080 
CTGGACCGCT ACTCCTTGGA CTCTGAAGAC CTCTACAGTC GGAATGCCGG CCCGCAAGCC 1140
AACTTCCGCA ACAAGAGAGG CCAGATGCCA GAAAACCCAT ACTCAGAGGT GGGGAAGATC 1200 
GCCAGCAAAG CCGTCTACGT CCCCGCCAAG CCCGCCAGGC GGAAGGGGAT GCTGGTGAAG 1260 
CAGTCCAACG TGGAGGACAG CCCCGAGAAG ACGTGCTCCA TCCCTATCCC GACCATCATC 1320 
GTGAAGGAGC CGTCCACCAG CAGCAGCGGC AAGAGCAGCC AGGGCAGCAG CATGGAGATC 1380 
GACCCCCAGG CCCCGGAGCC ACCGAGCCAG CTGCGGCCTG ACGAAAGCCT GACCGTCAGC 1440 
AGCCCCTTTG CCGCCGCCAT CGCCGGAGCC GTCCGCGACC GTGAGAAGCG GCTGGAAGCC 1500 
AGGAGGAACT CCCCGGCCTT CCTCTCCACA GACCTGGGGG ATGAGGATGT GGGCCTGGGG 1560 
CCACCCGCCC CCAGGACGCG GCCCTCCATG TTCCCCGAGG AGGGGGATTT TGCTGACGAG 1620 
GACAGCGCTG AGCAGCTGTC ATCCCCCATG CCGAGTGCCA CGCCCAGGGA GCCCGAAAAC 1680 
CATTTCGTGG GTGGCGCCGA GGCCAGTGCT CCGGGTGAGG CTGGGAGGCC GCTGAATTCC 1740 
ACGTCCAAAG CCCAGGGGCC CGAGAGCAGC CCAGCAGTGC CCTCCGCGAG CAGCGGCACA 1800 
GCCGGCCCCG GGAATTATGT CCACCCACTC ACAGGGCGGC TGCTTGATCC CAGCTCCCCG 1860 
CTGGCCCTGG CACTCTCCGC AAGGGACCGA GCCATGAAGG AGTCTCAACA GGGACCCAAA 1920 
GGGGAGGCCC CCAAGGCCGA CCTCAACAAA CCTCTTTACA TTGATACCAA AATGCGGCCC 1980 
AGCCTGGATG CCGGCTTCCC TACGGTCACC AGGCAGAACA CCCGGGGACC CCTGAGGCGG 2040 
CAGGAGACGG AGAACAAGTA CGAGACCGAC CTGGGCCGAG ACCGGAAAGG CGATGACAAG 2100 
AAGAACATGC TGATCGACAT CATGGACACG TCCCAGCAGA AGTCGGCTGG CCTGCTGATG 2160 
GTGCACACCG TGGACGCCAC TAAGCTGGAC AACGCCCTGC AGGAAGAGGA CGAGAAGGCA 2220 
GAGGTGGAGA TGAAGCCAGA CAGCTCGCCG TCCGAGGTGC CAGAAGGTGT TTCCGAAACC 2280
GAAGGTGCTT TACAGATCTC CGCTGCCCCC GAGCCCACCA CCGTGCCCGG CAGAACCATC 2340
GTCGCGGTGG GCTCCATGGA AGAGGCGGTG ATTTTGCCAT TCCGCATCCC TCCTCCCCCT 2400 
CTGGCATCCG TGGACTTGGA TGAGGATTTT ATTTTTACAG AGCCATTGCC TCCTCCCCTG 2460
GAATTTGCAA ATAGTTTTGA TATCCCCGAT GACCGGGCAG CTTCTGTCCC GGCTCTCTCA 2520 
GACTTAGTGA AGCAGAAGAA AAGCGACACC CCTCAGTCCC CTTCGTTGAA CTCCAGCCAA 2580
CCAACCAACT CTGCAGACAG CAAGAAGCCA GCCAGTCTTT CAAACTGTCT GCCTGCCTCA 2640 
TTCCTGCCAC CCCCTGAAAG CTTTGACGCC GTCGCCGACT CTGGGATCGA GGAGGTGGAC 2700
AGCCGGAGTA GCAGCGACCA CCACCTCGAG ACGACCAGCA CTATCTCCAC CGTGTCTAGC 2760 
ATCTCCACCC TGTCTTCCGA AGGTGGAGAG AATGTGGACA CCTGCACAGT CTATGCAGAT 2820 
GGGCAAGCAT TTATGGTTGA CAAACCCCCA GTACCTCCTA AGCCAAAAAT GAAGCCCATC 2880 
ATTCACAAAA GCAATGCACT TTATCAAGAC GCGCTCGTGG AAGAAGATGT AGATAGCTTT 2940
GTTATCCCCC CGCCCGCTCC CCCGCCCCCG CCGGGCAGTG CCCAGCCTGG GATGGCCAAG 3000 
GTTCTCCAGC CAAGGACCTC CAAGTTGTGG GGCGACGTCA CAGAGATCAA AAGCCCGATT 3060 
CTCTCAGGCC CAAAGGCAAA CGTTATTAGT GAATTGAACT CTATCCTACA GCAAATGAAC 3120 
CGAGAGAAAT TGGCAAAGCC GGGGGAAGGA CTGGATTCAC CAATGGGAGC CAAGTCCGCC 3180 
AGCCTCGCTC CAAGAAGCCC GGAGATCATG AGCACCATCT CAGGTACACG GAGCACGACG 3240 
GTCACCTTCA CTGTTCGCCC CGGCACCTCC CAGCCCATCA CCCTGCAGAG CCGGCCCCCC 3300 
GACTATGAAA GCAGGACCTC AGGAACAAGA CGTGCCCCAA GCCCTGTGGT CTCGCCAACA 3360 
GAGATGAACA AAGAGACCCT GCCCGCCCCC CTGTCTGCTG CCACCGCCTC TCCTTCTCCC 3420
GCTCTCTCAG ATGTCTTTAG CCTTCCAAGC CAGCCCCCTT CTGGGGATCT ATTTGGCTTG 3480 
AACCCAGCGG GACGCAGTAG GTCGCCATCC CCCTCGATAC TGCAACAGCC AATCTCAAAT 3540
AAGCCTTTTA CAACTAAACC TGTCCACCTG TGGACTAAAC CAGATGTGGC CGATTGGCTG 3600 
GAAAGTCTAA ACTTGGGTGA ACATAAAGAG GCCTTCATGG ACAATGAGAT CGATGGCAGT 3660 
CACTTACCAA ACCTGCAGAA GGAGGACCTC ATCGATCTTG GGGTAACTCG AGTCGGGCAC 3720
AGAATGAACA TAGAAAGGGC TTTGAAACAG CTGCTGGACA GATAAGGACG GCTGCTCTCC 3780
ACCTCGCAGA CTGCTCTTGT TATAAGTAGA GATGGGCTCG TGCTGAAACA TCTGAATGCC 3840
AAGCGAAGTC TGTGAGCATC AACCCCACTC CATGGGTTTG TCTCCTGGTA CCCAAAGAAA 3900 
TACTGAGTTG TGTCCACAAC ATGGCTGGGT CTTCAGACCC CTGGCTCACC ATGTGGGTGT 3960
CTTGGGCAGT TTCTATCACA CATGGGACAA GGGGAGGGAG TTTTTCTAAC ATGGAAAAAG 4020
ATTCCCAGCC TGCCGCCCAG CATGCAGGTG GCCTCGCTTT GCCGGGTCCG AGAGGCTCCC 4080 
CGTCAATTTT GCACGGGATC CTAGCTCTTG TAGGCAGACA CCAGTGCACT CTAGATACCT 4140 
CCTGAGACCT CCGTCCTCTG CTTTCCGGGC AGCTCTCACC ACCCCAGGCC CCGGCATGAG 4200
GCCTTTCCTC AGTCCTGTGG CCTCTCAGAG GACACCTGAT GCTCACCTGC CCCTCTTTCT 4260 
CCTGCACTTG GCTTGCAGTG AGATGCTCCC AGATGCATTT GTCCAGTGCC CCATCATGGG 4320
CCTGAAAGGC AGAGAAACTT TTTCCTACAC AGATTCTTTT CCCCATCTCC TCCTGTGGTT 4380
TGCATCCATG GCTCTTTGGC CATGAGGTTC CTGGCAGTGC TGGGAGTTTG GATGGGATCG 4440 
TGCCCAGCTT TGCTTAGCTT TCTTTATTTC TGCAAATCTG TTAGCATAAT TCCAAGGTGG 4500
CCAAGCAGAT GTCACATGGA GTTAGTCAAA GCACAAAGTC ACGATTCCAC AATGGAGGGG 4560
AGACCTGGCC AAGGGAGCCA GCCAGCGTGC AACTGCCCAA GCTCCAGGTC TCCAGGACAA 4620
GAGCAGTTGT CTGCCATGAG CACCCATCCA GGATGGAGAA TAAGGGCTTC TCTGCCTCTC 4680
AGAATTCTTT TTAATTGAAG ATGTCTTGAG CTCTGCAAAG ATCAGAGCAG GTGAGCATCC 4740
ACTTTGACAT GAAGGACAAG AAGACGCATG GCTCATGGCG GGCACATGCG GCTGCCAGTG 4800 
AGACAGCGTC TCCTCTGGGA GCTGGGCGGG CACAGCATCC TCAGTTCTGT GCCCAGCCAA 4860 
GGGTGAGCAT CTCTGCTGAG ACAGTCCTTT TGCTCTCGGA GGCCAGGGAA GATGGTACTT 4920
AGAGGCTTTT CCCCTATCGC TCTGGGTGTC TAGGAATCCC ACCAGCTTGT CTTAACAGTA 4980 
CAACAGCTTC TTTGAGGACC CAGTGGGTAT GGAGTATAGA CAGAACCCAG GGTTGAGAAC 5040
AGAAGGTGGG CGGCAGGATC AGAGTGAAAG CAGAGGCGTG AGGAGAGGAA AGCAGGGAGG 5100 
TCTCCTGGGC TGCCAGGTCA GCCTCTCTGG CAAGGCTTTC TTGAGCCCCG CCCCTTTCTT 5160
TCCCCGGAGT CCCTCCACCC CATAACAATA CCTCGAATTT CCAAAAGAGG TCACCAGATG 5220
CACATGGGCC GCAAAACACA CAGTCAGGCT TCCAGCACAT TCTCCCCCAT TTGGAGGATA 5280
CTCGAATGTC AGGTTTTTGG TTTTATTATT ATTTCAGAAC TAGCTCAGCC CATCTCTAAT 5340
TATAAAACAT GGTTTTGTTT TTTTTTTTTC CTTTTTTTCT TGATTAGGTC TGGAACAGCT 5400
CTAGAATGAA CACATAAAAT TTAGCAATTT AAAATCTTTC TTTACTGCAA GTTTAAATAG 5460
TTGTACAGAT AGTTTATAAG CACAATATTT TAAGAAAAAA AAGTGGCTGG TCTACTAGGC 5520
AGCCTTTGTG CCACTTCAGT GCTAGAAAGT TAAAGAAAAA AAAACTTTTG TGATTTAATA 5580
ATACTATTTC TGTGGAATAA TTATAAAAGT ATGACCTTTT TAAATCAACC TTATTTGGAT 5640
GCATCTGAAC CAGCAGAGCT GTGTTATATT TTCTATCTTT GCTAGAACTT CGTCATTGAA 5700
GGACAATTTC TTCAAAGTGG TTACAATTCA TAATGCAGCA GTTTCTCCAA AAACAAAAAC 5760
AAAACACACA CCACACACAC GCGCTTTTCC AGTCACACAC CCCTGATGTT GGAACCAAGT 5820
TTTTGGACCT TCTGTTCCAA AACCTTTTGC AGGTCAATCT TTGTATTTGA AATGATCCAA 5880
TCCAACTTGA AGTCAATTGA ATATTAAGGC GCTTTACTTC CGTGTGCTTT CAGTTTTTCC 5940
ATCATGAGAT GAATGAGCAT TACTCTAGAT AAATTTCAAG ACAGGATACT ACAGGTGGCC 6000
TGCTGAGGCT GCCCCATATT TTAGAAAATG TAAAAATGGT GGTTTGGCCA TTAATTTGTC 6060
TTCCATTTGA TGATACCGCA AAATTCCGTG AGTCCATTCC TTTGGCATGG CACTTTCCCT 6120
GGGCCTACAG TTGGTATTAC CTCTGTGCTC AGTGCCAGGC AAAACACTAG CTCAAAGGAG 6180
AGTCAAGGAA ACCGCTGGCA GACGATAACC AGTCGAAACT CGTGACTTCG GTTTGTTGAA 6240
CTTTGGCAGC CAGTTGGTGA GGGCCAGATG TTATTCCCTT TCTTAAAGAT ACTCCAAGCC 6300
ACATGCCACT AACCACAAGC AAGCTGGCTG CAAGACTAAA GAGCTGATAA CATAGTTTAT 6360
TTTTACACTG TCTTATTATA GAGAAGTAAT AGACCTATCA GAACCTGCAC TGACCAACAA 6420
ATAAACACAT GTTGCCAAGA TGAATCGGTC TCTATCTCTA TCTGCTTATT TTGGTACTGA 6480
AAGCAATAGT TCCTCATTCA AATCACCACC CACTGTTCTC CCCCTTTGGG ACATGTTAGG 6540
ACGAGGCCCT ATTCCATGCC CCTCTTTAAT GGTGGAACAA ATGTTAAACT GCTCATCTAA 6600
AGATCATGTT GATATTATTC CAGGTTTTAA GATCAACTTT TGTTACATAC TGTAATTTAA 6660
ATAAACTGCA TTTACATGCC TAGTTTCTGT AATATTGTGT ATACAAAACC CAAATCTCTC 6720
AAAATGTAAA TTATGTATAC CTGCCAAGAT ACCTTTTCCA GGGTGTCTGC GCACATTTTA 6780 
AGTTAATTCA CATAATATAA AAATTACTCA ATGTGACTGT TGATTTGCTG AACTTTACAT 6840
ATCACAAAGT GAATTATTTG TGATACTTTA GTTAATAAAA TGGTAAATTT TTTTCTCAGT 6900
TATTGAACAA GCAAGCATTA TCCAGTTGAT CTGGCAATGA CTTTTTGTGT GTGGGCCACA 6960
ATATTGATTT TCCCATTAAC AATTTTTTTT TGTTTTTTAA ATACTAATAT GTTTCACACT 7020
ATAGTTTGTG TAACAACACG TGTTCGCATT ATCTATGTTG CTGTTACTTT TGTGCTTTTA 7080 
TTCTTTTTAG ACTTTATAAA AAAAAAAAAA AGCTCCTGTA ATTTGCACTT TCTCCCAATC 7140 
CTTAAATCTC TTGTATGGCA ACCAAAATTA CTGTAAAAAA ATAAATATAC TATTGCACTA 7200 
AGGTTGTGGT TCTGATTGCA AACAAACAGT GAACACTGTC TGAATTAAAC AAAAAGCTGC 7260
CCGACTTGCA ATCTAATGTA GATTATCTCA GGCATTGTGG CCAGCTCTGC CTCTCTAAAA 7320
CTGACCAGAA AAATCTCTCT CATCGAGTAA ACAGGCTCCT GTCACTGAGC TAATCTGCCT 7380 
TGGTTCCATT TCCTTATTCT CAATTTATCA ATGGATACGT GCATGTTATT TCAGAATTAT 7440 
GCAAAACGTC AAAATCTGCT TCTGTGACCG CTGCTATAGG CGTGGAGCTG AGGCTCGGCT 7500 
TTTCCTTTTG TTCTGGGTGG AAGCAGCGGT GCCGCGGAGG GCCAGCCAGA TCCGGACCCT 7560
TCCCTTAGGG TCCAGTCTCC CCACACCCCA GCAGGGTGTC TTCTAGCCAT AAGGCCAAGG 7620
GAGTGGCAGA ACTGGGCCGC CTCTCTGGTT GACAAGCAAA CCACATGCTA AGGCTTGGAG 7680
CAAGAGAGAA TTTGTGTCTA TTGGCAAAGA ACTAAGCCAG GAAGACATGG GCCATCCCTC 7740
CGCTTTAGGG AAGCATATTT TAAACCTAAA CGTTGAACTT CTTCTTTGGC CTCACCAGTG 7800 
AAAACTTGTT GTCTTTAGTT CCTAAAGTTT CTTCTACTTT GGCACATTCC CCAGTTGAGC 7860 
AGCAGCCTCT ATGCTTCCAC GTTCAGGAAA AATTCCAGTC CTCATATCTT TTGTAGTTCA 7920
CCCTCAAGCT CTCCCGCTTC ACCATCCAAT AGTTTCTCCC AAACCTTGGC ACCCCCCTAG 7980
ACTTTGCTTC CAATGGTTTC TTCCAGACCA CTTTTCCTAG ATGAATATAT TCGTTTACCT 8040 
TACTAGGAAA ATTATTGGAA GATTTTTTCT TTTACTTGAA ATTGGAGGCA TTTTAATAAC 8100
TGGCGAACTG GAATGTGTTT CTGTATTTGT AGACAACCAT GTACCCATGC AAGTAGGTGA 8160
ACATTCCACA GTGGCTGGGT GACCACAGCA GCTGCATGCA GACAGGACTG CCCGTGCTTT 8220
GTGGGGAATC AGAGAATTTC CAAACTTGTT TCTCAGACTT CCGCAGATCT CATCACTTTG 8280
ATTTCTAATC CATGCTGTAT TGGTGATTTT GTTTATCGTT CCTGTAACTT GTTCTACATT 8340 
CCACAGTCTT TACCGTTTTA TGTTCAAAAT TACAACAATC CCTGTCCATT GATTCCACTC 8400 
TGGAACTCTT TGTTCATGCC AATTTTGAAA TTTTAATACG AGCCTTCAAA TAAACACAGA 8460
AAAGAAAAAA AAAAAAAAAA AAAAAAAA



SEQ ID NO:165 PEZ6 Protein sequence: 
Protein Accession #: BAA82974.1



1 11 21 31 41 51

MMMNVPGGGA AAVMMTGYNN GRCPRNSLYS DCIIEEKTVV LQKKDNEGFG FVLRGAKADT 60
PIEEFTPTPA FPALQYLESV DEGGVAWQAG LRTGDFLIEV NNENVVKVGH RQVVNMIRQG 120
GNHLVLKVVT VTRNLDPDDT ARKKAPPPPK RAPTTALTLR SKSMTSELEE LVDKDKPEEI 180
VPASKPSRAA ENMAVEPRVA TIKQRPSSRC FPAGSDMNSV YERQGIAVMT PTVPGSPKAP 240 
FLGIPRGTMR RQKSIDSRIF LSGITEEERQ FLAPPMLKFT RSLSMPDTSE DIPPPPQSVP 300

PSPPPPSPTT YNCPKSPTPR VYGTIKPAFN QNSAAKVSPA TRSDTVATMM REKGMYFRRE 360

LDRYSLDSED LYSRNAGPQA NFRNKRGQMP ENPYSEVGKI ASKAVYVPAK PARRKGMLVK 420 
QSNVEDSPEK TCSIPIPTII VKEPSTSSSG KSSQGSSMEI DPQAPEPPSQ LRPDESLTVS 480
SPFAAAIAGA VRDREKRLEA RRNSPAFLST DLGDEDVGLG PPAPRTRPSM FPEEGDFADE 540 
DSAEQLSSPM PSATPREPEN HFVGGAEASA PGEAGRPLNS TSKAQGPESS PAVPSASSGT 600 

AGPGNYVHPL TGRLLDPSSP LALALSARDR AMKESQQGPK GEAPKADLNK PLYIDTKMRP 660
SLDAGFPTVT RQNTRGPLRR QETENKYETD LGRDRKGDDK KNMLIDIMDT SQQKSAGLLM 720 
VHTVDATKLD NALQEEDEKA EVEMKPDSSP SEVPEGVSET EGALQISAAP EPTTVPGRTI 780
VAVGSMEEAV ILPFRIPPPP LASVDLDEDF IFTEPLPPPL EFANSFDIPD DRAASVPALS 840 
DLVKQKKSDT PQSPSLNSSQ PTNSADSKKP ASLSNCLPAS FLPPPESFDA VADSGIEEVD 900 

SRSSSDHHLE TTSTISTVSS ISTLSSEGGE NVDTCTVYAD GQAFMVDKPP VPPKPKMKPI 960
IHKSNALYQD ALVEEDVDSF VIPPPAPPPP PGSAQPGMAK VLQPRTSKLW GDVTEIKSPI 1020 
LSGPKANVIS ELNSILQQMN REKLAKPGEG LDSPMGAKSA SLAPRSPEIM STISGTRSTT 1080
VTFTVRPGTS QPITLQSRPP DYESRTSGTR RAPSPVVSPT EMNKETLPAP LSAATASPSP 1140
ALSDVFSLPS QPPSGDLFGL NPAGRSRSPS PSILQQPISN KPFTTKPVHL WTKPDVADWL 1200 

ESLNLGEHKE AFMDNEIDGS HLPNLQKEDL IDLGVTRVGH RMNIERALKQ LLDR



SEQ ID NO:166 PEZ4 DNA SEQUENCE 

Nucleic Acid Accession #: NM_000024 
Coding sequence: 220-1461 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51

ACTGCGAAGC GGCTTCTTCA GAGCACGGGC TGGAACTGGC AGGCACCGCG AGCCCCTAGC 60 

ACCCGACAAG CTGAGTGTGC AGGACGAGTC CCCACCACAC CCACACCACA GCCGCTGAAT 120
GAGGCTTCCA GGCGTCCGCT CGCGGCCCGC AGAGCCCCGC CGTGGGTCCG CCCGCTGAGG 180 
CGCCCCCAGC CAGTGCGCTT ACCTGCCAGA CTGCGCGCCA TGGGGCAACC CGGGAACGGC 240
AGCGCCTTCT TGCTGGCACC CAATAGAAGC CATGCGCCGG ACCACGACGT CACGCAGCAA 300 
AGGGACGAGG TGTGGGTGGT GGGCATGGGC ATCGTCATGT CTCTCATCGT CCTGGCCATC 360

GTGTTTGGCA ATGTGCTGGT CATCACAGCC ATTGCCAAGT TCGAGCGTCT GCAGACGGTC 420
ACCAACTACT TCATCACTTC ACTGGCCTGT GCTGATCTGG TCATGGGCCT GGCAGTGGTG 480
CCCTTTGGGG CCGCCCATAT TCTTATGAAA ATGTGGACTT TTGGCAACTT CTGGTGCGAG 540 
TTTTGGACTT CCATTGATGT GCTGTGCGTC ACGGCCAGCA TTGAGACCCT GTGCGTGATC 600
GCAGTGGATC GCTACTTTGC CATTACTTCA CCTTTCAAGT ACCAGAGCCT GCTGACCAAG 660 

AATAAGGCCC GGGTGATCAT TCTGATGGTG TGGATTGTGT CAGGCCTTAC CTCCTTCTTG 720
CCCATTCAGA TGCACTGGTA CCGGGCCACC CACCAGGAAG CCATCAACTG CTATGCCAAT 780 
GAGACCTGCT GTGACTTCTT CACGAACCAA GCCTATGCCA TTGCCTCTTC CATCGTGTCC 840 
TTCTACGTTC CCCTGGTGAT CATGGTCTTC GTCTACTCCA GGGTCTTTCA GGAGGCCAAA 900 
AGGCAGCTCC AGAAGATTGA CAAATCTGAG GGCCGCTTCC ATGTCCAGAA CCTTAGCCAG 960

GTGGAGCAGG ATGGGCGGAC GGGGCATGGA CTCCGCAGAT CTTCCAAGTT CTGCTTGAAG 1020
GAGCACAAAG CCCTCAAGAC GTTAGGCATC ATCATGGGCA CTTTCACCCT CTGCTGGCTG 1080 
CCCTTCTTCA TCGTTAACAT TGTGCATGTG ATCCAGGATA ACCTCATCCG TAAGGAAGTT 1140
TACATCCTCC TAAATTGGAT AGGCTATGTC AATTCTGGTT TCAATCCCCT TATCTACTGC 1200 
CGGAGCCCAG ATTTCAGGAT TGCCTTCCAG GAGCTTCTGT GCCTGCGCAG GTCTTCTTTG 1260 

AAGGCCTATG GGAATGGCTA CTCCAGCAAC GGCAACACAG GGGAGCAGAG TGGATATCAC 1320
GTGGAACAGG AGAAAGAAAA TAAACTGCTG TGTGAAGACC TCCCAGGCAC GGAAGACTTT 1380 
GTGGGCCATC AAGGTACTGT GCCTAGCGAT AACATTGATT CACAAGGGAG GAATTGTAGT 1440 
ACAAATGACT CACTGCTGTA AAGCAGTTTT TCTACTTTTA AAGACCCCCC CCCCCCCAAC 1500
AGAACACTAA ACAGACTATT TAACTTGAGG GTAATAAACT TAGAATAAAA TTGTAAAAAT 1560

TGTATAGAGA TATGCAGAAG GAAGGGCATC CTTCTGCCTT TTTTATTTTT TTAAGCTGTA 1620
AAAAGAGAGA AAACTTATTT GAGTGATTAT TTGTTATTTG TACAGTTCAG TTCCTCTTTG 1680
CATGGAATTT GTAAGTTTAT GTCTAAAGAG CTTTAGTCCT AGAGGACCTG AGTCTGCTAT 1740 
ATTTTCATGA CTTTTCCATG TATCTACCTC ACTATTCAAG TATTAGGGGT AATATATTGC 1800
TGCTGGTAAT TTGTATCTGA AGGAGATTTT CCTTCCTACA CCCTTGGACT TGAGGATTTT 1860 

GAGTATCTCG GACCTTTCAG CTGTGAACAT GGACTCTTCC CCCACTCCTC TTATTTGCTC 1920

ACACGGGGTA TTTTAGGCAG GGATTTGAGG AGCAGCTTCA GTTGTTTTCC CGAGCAAAGG 1980 
TCTAAAGTTT ACAGTAAATA AAATGTTTGA CCATG 



SEQ ID NO:167 PEZ4 Protein sequence:
Protein Accession #: NP_000015.1



1 11 21 31 41 51
| | | | | |

MGQPGNGSAF LLAPNRSHAP DHDVTQQRDE VWVVGMGIVM SLIVLAIVFG NVLVITAIAK 60 
FERLQTVTNY FITSLACADL VMGLAVVPFG AAHILMKMWT FGNFWCEFWT SIDVLCVTAS 120
IETLCVIAVD RYFAITSPFK YQSLLTKNKA RVIILMVWIV SGLTSFLPIQ MHWYRATHQE 180
AINCYANETC CDFFTNQAYA IASSIVSFYV PLVIMVFVYS RVFQEAKRQL QKIDKSEGRF 240
HVQNLSQVEQ DGRTGHGLRR SSKFCLKEHK ALKTLGIIMG TFTLCWLPFF IVNIVHVIQD 300
NLIRKEVYIL LNWIGYVNSG FNPLIYCRSP DFRIAFQELL CLRRSSLKAY GNGYSSNGNT 360
GEQSGYHVEQ EKENKLLCED LPGTEDFVGH QGTVPSDNID SQGRNCSTND SLL 



SEQ ID NO:168 PEZ1 DNA SEQUENCE 

Nucleic Acid Accession #: NM_004457 
Coding sequence: 143-2305 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51
| | | | | |

GAATTCGTTG TTGGGAAGGA CTGGGGAAAC AGCTGTAACA TTTGCCACCC TCAGAAGCTG 60
CTGGTCCTGT GTCACACCAC CTTAGCCTCT TGATCGAGGA AGATTCTCGC TGAAGTCTGT 120
TAATTCTACT TTTTGAGTAC TTATGAATAA CCACGTGTCT TCAAAACCAT CTACCATGAA 180
GCTAAAACAT ACCATCAACC CTATTCTTTT ATATTTTATA CATTTTCTAA TATCACTTTA 240 
TACTATTTTA ACATACATTC CGTTTTATTT TTTCTCCGAG TCAAGACAAG AAAAATCAAA 300

CCGAATTAAA GCAAAGCCTG TAAATTCAAA ACCTGATTCT GCATACAGAT CTGTTAATAG 360
TTTGGATGGT TTGGCTTCAG TATTATACCC TGGATGTGAT ACTTTAGATA AAGTTTTTAC 420
ATATGCAAAA AACAAATTTA AGAACAAAAG ACTCTTGGGA ACACGTGAAG TTTTAAATGA 480 
GGAAGATGAA GTACAACCAA ATGGAAAAAT TTTTAAAAAG GTTATTCTTG GACAGTATAA 540 
TTGGCTTTCC TATGAAGATG TCTTTGTTCG AGCCTTTAAT TTTGGAAATG GATTACAGAT 600

GTTGGGTCAG AAACCAAAGA CCAACATCGC CATCTTCTGT GAGACCAGGG CCGAGTGGAT 660
GATAGCTGCA CAGGCGTGTT TTATGTATAA TTTTCAGCTT GTTACATTAT ATGCCACTCT 720 
AGGAGGTCCA GCCATTGTTC ATGCATTAAA TGAAACAGAG GTGACCAACA TCATTACTAG 780 
TAAAGAACTC TTACAAACAA AGTTGAAGGA TATAGTTTCT TTGGTCCCAC GCCTGCGGCA 840 
CATCATCACT GTTGATGGAA AGCCACCGAC CTGGTCCGAC TTCCCCAAGG GCATCATTGT 900 

GCATACCATG GCTGCAGTGG AGGCCCTGGG AGCCAAGGCC AGCATGGAAA ACCAACCTCA 960
TAGCAAACCA TTGCCCTCAG ATATTGCAGT AATCATGTAC ACAAGTGGAT CCACAGGACT 1020 
TCCAAAGGGA GTCATGATCT CACATAGTAA CATTATTGCT GGTATAACTG GGATGGCAGA 1080 
AAGGATTCCA GAACTAGGAG AGGAAGATGT CTACATTGGA TATTTGCCTC TGGCCCATGT 1140
TCTAGAATTA AGTGCTGAGC TTGTCTGTCT TTCTCACGGA TGCCGCATTG GTTACTCTTC 1200

ACCACAGACT TTAGCAGATC AGTCTTCAAA AATTAAAAAA GGAAGCAAAG GGGATACATC 1260
CATGTTGAAA CCAACACTGA TGGCAGCAGT TCCGGAAATC ATGGATCGGA TCTACAAAAA 1320 
TGTCATGAAT AAAGTCAGTG AAATGAGTAG TTTTCAACGT AATCTGTTTA TTCTGGCCTA 1380
TAATTACAAA ATGGAACAGA TTTCAAAAGG ACGTAATACT CCACTGTGCG ACAGCTTTGT 1440 
TTTCCGGAAA GTTCGAAGCT TGCTAGGGGG AAATATTCGT CTCCTGTTGT GTGGTGGCGC 1500 

TCCACTTTCT GCAACCACGC AGCGATTCAT GAACATCTGT TTCTGCTGTC CTGTTGGTCA 1560

GGGATACGGG CTCACTGAAT CTGCTGGGGC TGGAACAATT TCCGAAGTGT GGGACTACAA 1620
TACTGGCAGA GTGGGAGCAC CATTAGTTTG CTGTGAAATC AAATTAAAAA ACTGGGAGGA 1680 
AGGTGGATAC TTTAATACTG ATAAGCCACA CCCCAGGGGT GAAATTCTTA TTGGGGGCCA 1740 
AAGTGTGACA ATGGGGTACT ACAAAAATGA AGCAAAAACA AAAGCTGATT TCTCTGAAGA 1800

TGAAAATGGA CAAAGGTGGC TCTGTACTGG GGATATTGGA GAGTTTGAAC CCGATGGATG 1860
CTTAAAGATT ATTGATCGTA AAAAGGACCT TGTAAAACTA CAGGCAGGGG AATATGTTTC 1920
TCTTGGGAAA GTAGAGGCAG CTTTGAAGAA TCTTCCACTA GTAGATAACA TTTGTGCATA 1980 
TGCAAACAGT TATCATTCTT ATGTCATTGG ATTTGTTGTG CCAAATCAAA AGGAACTAAC 2040
TGAACTAGCT CGAAAGAAAG GACTTAAAGG GACTTGGGAG GAGCTGTGTA ACAGTTGTGA 2100

AATGGAAAAT GAGGTACTTA AAGTGCTTTC CGAAGCTGCT ATTTCAGCAA GTCTGGAAAA 2160
GTTTGAAATT CCAGTAAAAA TTCGTTTGAG TCCTGAACCG TGGACCCCTG AAACTGGTCT 2220
GGTGACAGAT GCCTTCAAGC TGAAACGCAA AGAGCTTAAA ACACATTACC AGGCGGACAT 2280 
TGAGCGAATG TATGGAAGAA AATAATTATT CTCTTCTGGC ATCAGTTTGC TACAGTGAGC 2340
TCACATCAAA TAGGAAAATA CTTGAAATGC ATGTCTCAAG CTGCAAGGCA AACTCCATTC 2400

CTCATATTAA ACTATTACTT CTCATGACGT CACCATTTTT AACTGACAGG ATTAGTAAAA 2460
CATTAAGACA GCAAACTTGT GTCTGTCTCT TCTTTCATTT TCCCCGCCAC CAACTTACTT 2520
TACCACCTAT GACTGTACTT GTCAGTATGA GAATTTTTCT GAATCATATT GGGGAAGCAG 2580 
TGATTTTAAA ACCTCAAGTT TTTAAACATG ATTTATATGT TCTGTATAAT GTTCAGTTTG 2640
TAACTTTTTA AAAGTTTGGA TGTATAGAGG GATAAATAGG AAATATAAGA ATTGGTTATT 2700

TGGGGGCTTT TTTACTTACT GTATTTAAAA ATACAAGGGT ATTGATATGA AATTATGTAA 2760

ATTTCAAATG CTTATGAATC AAATCATTGT TGAACAAAAG ATTTGTTGCT GTGTAATTAT 2820 
TGTCTTGTAT GCATTTGAG A G AAATA A ATA TACCCATACT TATGTTTTAA GAAGTTGAGA 2880 
TCTTGTGAAT ATATGCCTGT CAGTGTCTTC TTTATATATT TATTTTTTAT TAGAAAAAAT 2940 
GAAGTTTGGT TGGTGATGCA TGAAACAAAA TAGCAAGAGA GGGTTATAGT TTAATAGTAA 3000 
70 GGGAGATAAC ACAGCATGTG TAGCACCAGT TGATAATTGG TCTCTAGTAG CTTACTGTCA 3060 
AAATGTTCAA TGAAGTCTTC TGTTCATCTG TTGAAACTAG GAAAATACCC AAACTTAAAT 3120 
GGAAGAATTC TGAAAGAGAG GATAGAATTT AAAGAACAAG AGTATATAAA GTTATTCTTT 3180 
GAATATTTCG TTGACTATAT GTACATTGAG TTATCTATAT TTGTAAACAA ATTAGTCATG 3240 
GAAAATTATT CTATTCCAAA GTCTCCTTTT AGTCTAG ATA ATCATTATTT C ATTTTAAAA 3300 

7 5 TTAGTGTTTT TCATAGTTTG CACTG ATGCG TGTATGG ATG TGTGTGAGTC AGTGGTAGCT 3360 

TATTTAAAAA GCACCTTATC CTTTCTCCCA TAACCTTTGT ACACTAAAAA ATGAAAGAAT 3420 
TTAGAATGTA TTTG ATGATA GCATTCTCAC TAAGACACAT GAGAATTTAA CTTTATAACC 3480 
GCGTGAGTTA AG ATTTAATT CATAGGTTTT G ATGTCATTG TTGAAGTTAT TTGTAATTCA 3540 
G A A ACCTTGC TTGTGTG ATA CATAGTAAGT CTCTTCATTT ATTACTGCTT GCCTGTTGTT 3600 
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ATATCTGGAT TATCAAAAGC AATAGTGCAC CAATTAAGAT GTGCTCAAAT CAGGACTTAA 3660 
ATCATAGGCA CCACATTTTT CATGTC AGAC TAGTTACTTT GTTGATTCTC AGTTACTGTA 3720 
GGCATCAAAA GGCAAAAATC A 

5 

SEQ ID NO:169 PEZ1 Protein sequence:
Protein Accession #: NP_004448.1

1 11 21 31 41 51
| | | | | |
MNNHVSSKPS TMKLKHTINP ILLYFIHFLI SLYTILTYIP FYFFSESRQE KSNRIKAKPV 60
NSKPDSAYRS VNSLDGLASV LYPGCDTLDK VFTYAKNKFK NKRLLGTREV LNEEDEVQPN 120
GKIFKKVILG QYNWLSYEDV FVRAFNFGNG LQMLGQKPKT NIAIFCETRA EWMIAAQACF 180
MYNFQLVTLY ATLGGPAIVH ALNETEVTNI ITSKELLQTK LKDIVSLVPR LRHIITVDGK 240
PFTWSDFPKG IIVHTMAAVE ALGAKASMEN QPHSKPLPSD IAVIMYTSGS TGLPKGVMIS 300
HSNIIAGITG MAERIPELGE EDVYIGYLPL AHVLELSAEL VCLSHGCRIG YSSPQTLADQ 360
SSKIKKGSKG DTSMLKPTLM AAVPEIMDRI YKNVMNKVSE MSSFQRNLFI LAYNYKMEQI 420
SKGRNTPLCD SFVFRKVRSL LGGNIRLLLC GGAPLSATTQ RFMNICFCCP VGQGYGLTES 480
AGAGTISEVW DYNTGRVGAP LVCCEIKLKN WEEGGYFNTD KPHPRGEILI GGQSVTMGYY 540
KNEAKTKADF SEDENGQRWL CTGDIGEFEP DGCLKIIDRK KDLVKLQAGE YVSLGKVEAA 600
LKNLPLVDNI CAYANSYHSY VIGFVVPNQK ELTELARKKG LKGTWEELCN SCEMENEVLK 660
VLSEAAISAS LEKFEIPVKI RLSPEPWTPE TGLVTDAFKL KRKELKTHYQ ADIERMYGRK

SEQ ID NO:170 PCQ7 DNA SEQUENCE 

Nucleic Acid Accession #: none found

Coding sequence: 38-1075 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51

| | | | | |

AGCAACGACG CCGGGCAGCG GGAGCGGCGG CCGCGCCATG TGGCTGCTGG GGCCGCTGTG 60 

CCTGCTGCTG AGCAGCGCCG CGGAGAGCCA GCTGCTCCCC GGGAACAACT TCACCAATGA 120 

GTGCAACATA CCAGGCAACT TCATGTGCAG CAATGGACGG TGCATCCCGG GCGCCTGGCA 180 

GTGTGACGGG CTGCCTGACT GCTTCGACAA GAGTGATGAG AAGGAGTGCC CCAAGGCTAA 240 

GTCGAAATGT GGCCCAACCT TCTTCCCCTG TGCCAGCGGC ATCCATTGCA TCATTGGTCG 300

CTTCCGGTGC AATGGGTTTG AGGACTGTCC CGATGGCAGC GATGAAGAGA ACTGCACAGC 360 

AAACCCTCTG CTTTGCTCCA CCGCCCGCTA CCACTGCAAG AACGGCCTCT GTATTGACAA 420 

GAGCTTCATC TGCGATGGAC AGAATAACTG TCAAGACAAC AGTGATGAGG AAAGCTGTGA 480 

AAGTTCTCAA GAACCCGGCA GTGGGCAGGT GTTTGTGACT TCAGAGAACC AACTTGTGTA 540

TTACCCCAGC ATCACCTATG CCATCATCGG CAGCTCCGTC ATTTTTGTGC TGGTGGTGGC 600

CCTGCTGGCA CTGGTCTTGC ACCACCAGCG GAAGCGGAAC AACCTCATGA CGCTGCCCGT 660 

GCACCGGCTG CAGCACCCTG TGCTGCTGTC CCGCCTGGTG GTCCTGGACC ACCCCCACCA 720 

CTGCAACGTC ACCTACAACG TCAATAATGG CATCCAGTAT GTGGCCAGCC AGGCGGAGCA 780 

GAATGCGTCG GAAGTAGGCT CCCCACCCTC CTACTCCGAG GCCTTGCTGG ACCAGAGGCC 840 

TGCGTGGTAT GACCTTCCTC CACCGCCCTA CTCTTCTGAC ACGGAATCTC TGAACCAAGC 900

CGACCTGCCC CCCTACCGCT CCCGGTCCGG GAGTGCCAAC AGTGCCAGCT CCCAGGCAGC 960 

CAGCAGCCTC CTGAGCGTGG AAGACACCAG CCACAGCCCG GGGCAGCCTG GCCCCCAGGA 1020 

GGGCACTGCT GAGCCCAGGG ACTCTGAGCC CAGCCAGGGC ACTGAAGAAG TATAAGTCCC 1080 

AGTTATTCCA AAGTCCATAT GGGTTAATCT GCTCTGACTT GTTGCCATTC TAACAATTTG 1140

TGCTCATGGG AAGCTCTTTA AGCACCTGTA AGGATGTCTC AAGTTACAGT TTGGGATATT 1200

AACTATCTCT GCATTCCCCT CCTCCCCCAG ACTTCAGAGA TGTTTTTCTG GCGTCTCAGT 1260 

TGACATGATC TGTTGTGCGT CTTTTCTGTC AGGTCACTCT TCCCTTGGGA COCGAGATCA 1320 

CACCCTCATT TTTCACATTA TTCTGTTTCT GTTGGAGAGA CAGCATATAA AACAGTATTG 1380 

AAATAGGCTG GGAGAGAGCA ATGTTTCTGT GCTATATTGG ATGCTCAGAA GTGCAGGAGA 1440 

CGCTGGACCC AATTCTCTCT GCTGGGTAGT TACCTTATAG CATTTGGGGA TTTGGGTTAG 1500

ATGATCTAAC CAGGAGGCCA TCACTGGATG GTCACCCCCC CAAAAAAATT CCATTTGAGC 1560 

ATCAAAACCT GCTTTGCACA ATCCTATTTG ATGCCCCCAG TTCAGCAGAG TCAGTGGCCA 1620 

AAGAAAACTT TGGACGTGAG TAACACCCTT CAGCAGTCGC AACGTTATTT TGGTTTTGTG 1680 

AAGGACTCTG AAACCATCTA CCCTGTATAA ATTCTGGCTT TAGAAATTTG CCCAAGAATG 1740 

CTCATTCTGA GAGCTTTCCT CAGCAGCATA TATCATCAGC CTCATCCTAA AATAGGCAGG 1800

GAGCCCCTCC CATGAGTTTA TCCAAGTTCT CAGCTCCTAA AATGCAGGCT GCCAAGACCC 1860 

TACACCTGCC CTGGCTCTAC AGCCACTTAC CTGGTTTCTG GACTGTCACC CTCCCAGCTG 1920 

ACCTGCCCGT AGCCAAGGAA TGAGGACCTA ACTTGAGTTG GCCCAAAGTC TGACCTGGCT 1980 

GTATGTCCCT GTGGCCCACA CCCAGCCTGT CTTGCTCATT CATGCAGCCT CAACACTGGC 2040 

CTCCAAAGTT CCCTTAACAC TTGCAAAGTC CTTTTTACCT GTGCATTTGG ACTTGAGGAC 2100

ACTGGTTTCT ATCACAGGTG AGAGCCATGT TCAATACCTC CAGCAAGCTC TCCTGGCTCC 2160 

CTGCACTGTG CACGCTCCTC TTCCCAAGGT CCCAATACCA GCACCTCTAG TTAGAGTTAG 2220 

GGTCAGGGTC AGGCCTCTCC CAACATCCCA GTAGTTTCTC CTCTGAGACA CATGGGCAAG 2280 

AGACAATTTG GAGTCAAGAT TTTCCATTTG GATCTATTTT AAATCTTTTA GAAATGCATT 2340

TGAAACAGTG TGTTTGTTTT TTCCCTTCTA GTTAAGGGAC TATTTATATG TGTATAGGAA 2400

AGCTGTCTCT TTTTTTGTTT TTCCTTTAAC AAGGTCCAAA GAAAGATGCA AAAGGAGATC 2460 

ACACCCTTGC CCCGCTGAGC CCCGTGATAA CAAGTCACTC CAGACTAACC TGTGTGCCAG 2520 

ACATTTGTGC ATTGTTGCAC TTTGAGGTTA TTATTTATCA AGTTCTTGAA GGAAGCAGAA 2580

AGAGGGACTC CTCTCTCCCT CCGTGTATAG TCTCTATGTT TGTGCTAGTT TTTCTTTTTT 2640

TTCTCTGTGT CCAGTCAGCC ACAGGGCCCG CCTCCCTGCA GGAATAAGGG GTAAAACGTT 2700

AGGTGTTGTT TGGCAAGAAA CCACACTGAC TGATGAGGGG TAAAATGGAA CCAGGTAGAG 2760 

CCACTCCGGG CAGCTGTCAC CCATTCAGAA CTTCTTTCCG CAGCTGAAGA AATGTTCAGT 2820 

AACCTGTTTG ACGCTAATTA AAACAGAGCC TGCAGGAAGT GGGGCTAAAG TGGCATTCAG 2880 

TGATCCTGTT CTGTAGACTT TTCTTTCTTT TTTTAACCAA ATCCAAAGGA TGTTACAGAA 2940

AAGCTAGCCA CTGGTATTTT GTTTTGTTTA AAAAAAAAAA GAAAGAAAGA AAGAAAGAAA 3000

AACGGAAAGG AACCTAGCTG CCTGTATCTT TCATTTTTAA AATAGCACTT GAGTTATTTT 3060 

CTGAGTAATC CAATAAAGAA CTTTTGATGA CAGCCAGAAT GTGTTAGAAC TCTGGCTGAA 3120 

CATTTCATCT CCTGTGAGTC AGAAGGGCTT TATTTCTCCC TTTGATGGGG CCCCTTCTTC 3180 

TTTCTGGTGC TCTGGAAGTT GTTTAGAGGA AAGAATTCTA ATTTTAATTA ATTGCGCAGT 3240 

GAGTTAATCT CACTCGCTTT TCTGCTTCCA GGCATCTTAG GAAAAACAAA TGGTTTTAGT 3300 

AGATAAGGGA TGCCTACTAA TGCTTTTTTA AAACAAACAG GGACATTTTT ATTATAGATT 3360 

TGATTTTTTT AATGAATGTT TTTAAAAATA TATAAATAGG ACACCAAAGC GGCAGGGTTT 3420 

TTTTTGGGGG GAGGGGGTTT GTTTTCCAAC TCAAGATGGC ACATTAGTGG CCAGCAATAT 3480 

TTTTTAACTC ATTCCAACCA GGAAGCTTTT TTATACATTG CCTAAATCTA CGCCAACCAG 3540 

AAAATAGTCT CATCTCTTTT TTTCTCAAAT GAGATCCGTG TTTTATTTTA GCATTAAATT 3600 

AGTTACACTG TGATGACTGG CCTATTACCT GACTCAGCTC CCTCTACCTT GAAATTGACA 3660 

TTTTTAAAAA ATGCAACTAA GTGGTTAATA GTGTGTGACG CTCAAAGTTA ATGTAAACTG 3720 

GAAAGGTTGT GTGTCGTTGC TTTTTGTGTT TTGGTTAGGC TTGGTTTTGT TTTTTAATTT 3780 

TTATACTTTC TAATAAATTT GCAGTTTCAT TCTTTCTGTT TGTGCAAAWG GWMCTAMARM 3840 

AAMMAAAAAC AWYWTTGGGG GGGCTTGGGC CTCGGAAAAA GTTTTTAACA CCACTTCGGG 3900 

TGGGGCGGCG GGGCCCACGT AGGTACGGCG ACCACGCGGG CCCAAACGGG ACCCCAGAAG 3960 

GAAACCCTGG CCAAGAAAAA GGTGGCGAGA ATTCTCCACA CCAGAAAAAA ACGCGCCGGG 4020 

GGAAACCGCA GAGTGTTGCG TAAACCACAC CCGAAGAGAG AACTCAGAAG CACACAAGCG 4080 
GGACTCAACC AGGAGGACCC AAGGGAACCC GATAGAGTAC G 
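
The coding-sequence coordinates in these listings (for example 38-1075 for PCQ7) are 1-based and inclusive. The following is a minimal Python sketch, assuming the listing rows are available as plain text and that the standard genetic code applies. It shows how the position counts and grouping spaces can be stripped, and how such a range maps to a slice and a translation. The helper names `clean`, `cds` and `translate` are illustrative only and do not come from the original document.

```python
import re
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order (assumption:
# the standard code is the appropriate one for these human sequences).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def clean(listing: str) -> str:
    """Drop position counts, ruler marks and spaces; keep only letters."""
    return re.sub(r"[^A-Za-z]", "", listing).upper()


def cds(seq: str, start: int, end: int) -> str:
    """Return the 1-based, inclusive range start..end of seq."""
    return seq[start - 1:end]


def translate(dna: str) -> str:
    """Translate complete codons; 'X' marks codons with ambiguity codes."""
    usable = len(dna) - len(dna) % 3
    codons = (dna[i:i + 3] for i in range(0, usable, 3))
    return "".join(CODON_TABLE.get(c, "X") for c in codons)


# First row of the PCQ7 DNA listing (positions 1-60).
first_row = "AGCAACGACG CCGGGCAGCG GGAGCGGCGG CCGCGCCATG TGGCTGCTGG GGCCGCTGTG 60"
seq = clean(first_row)
print(cds(seq, 38, 40))             # start codon at the stated CDS start
print(translate(cds(seq, 38, 60)))  # residues encoded within the first row
```

Applied to the full cleaned sequence, `translate(cds(seq, 38, 1075))` would be expected to end in `*`, since the stated range includes the stop codon.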



SEQ ID NO:171 PCQ7 Protein sequence:

Protein Accession #: none found

1 11 21 31 41 51
| | | | | |
MWLLGPLCLL LSSAAESQLL PGNNFTNECN IPGNFMCSNG RCIPGAWQCD GLPDCFDKSD 60
EKECPKAKSK CGPTFFPCAS GIHCIIGRFR CNGFEDCPDG SDEENCTANP LLCSTARYHC 120
KNGLCIDKSF ICDGQNNCQD NSDEESCESS QEPGSGQVFV TSENQLVYYP SITYAIIGSS 180
VIFVLVVALL ALVLHHQRKR NNLMTLPVHR LQHPVLLSRL VVLDHPHHCN VTYNVNNGIQ 240
YVASQAEQNA SEVGSPPSYS EALLDQRPAW YDLPPPPYSS DTESLNQADL PPYRSRSGSA 300
NSASSQAASS LLSVEDTSHS PGQPGPQEGT AEPRDSEPSQ GTEEV



SEQ ID NO:172 PEL3 DNA SEQUENCE
Nucleic Acid Accession #: NM_005656.1 

Coding sequence: 57-1535 (underlined sequences correspond to start and stop codons) 

1 11 21 31 41 51 

| | | | | |

GTCATATTGA ACATTCCAGA TACCTATCAT TACTCGATGC TGTTGATAAC AGCAAGATGG 60 

CTTTGAACTC AGGGTCACCA CCAGCTATTG GACCTTACTA TGAAAACCAT GGATACCAAC 120 

CGGAAAACCC CTATCCCGCA CAGCCCACTG TGGTCCCCAC TGTCTACGAG GTGCATCCGG 180 

CTCAGTACTA CCCGTCCCCC GTGCCCCAGT ACGCCCCGAG GGTCCTGACG CAGGCTTCCA 240 

ACCCCGTCGT CTGCACGCAG CCCAAATCCC CATCCGGGAC AGTGTGCACC TCAAAGACTA 300 

AGAAAGCACT GTGCATCACC TTGACCCTGG GGACCTTCCT CGTGGGAGCT GCGCTGGCCG 360 

CTGGCCTACT CTGGAAGTTC ATGGGCAGCA AGTGCTCCAA CTCTGGGATA GAGTGCGACT 420 

CCTCAGGTAC CTGCATCAAC CCCTCTAACT GGTGTGATGG CGTGTCACAC TGCCCCGGCG 480 

GGGAGGACGA GAATCGGTGT GTTCGCCTCT ACGGACCAAA CTTCATCCTT CAGATGTACT 540 

CATCTCAGAG GAAGTCCTGG CACCCTGTGT GCCAAGACGA CTGGAACGAG AACTACGGGC 600 

GGGCGGCCTG CAGGGACATG GGCTATAAGA ATAATTTTTA CTCTAGCCAA GGAATAGTGG 660 

ATGACAGCGG ATCCACCAGC TTTATGAAAC TGAACACAAG TGCCGGCAAT GTCGATATCT 720 

ATAAAAAACT GTACCACAGT GATGCCTGTT CTTCAAAAGC AGTGGTTTCT TTACGCTGTT 780 

TAGCCTGCGG GGTCAACTTG AACTCAAGCC GCCAGAGCAG GATCGTGGGC GGTGAGAGCG 840 

CGCTCCCGGG GGCCTGGCCC TGGCAGGTCA GCCTGCACGT CCAGAACGTC CACGTGTGCG 900 

GAGGCTCCAT CATCACCCCC GAGTGGATCG TGACAGCCGC CCACTGCGTG GAAAAACCTC 960 

TTAACAATCC ATGGCATTGG ACGGCATTTG CGGGGATTTT GAGACAATCT TTCATGTTCT 1020

ATGGAGCCGG ATACCAAGTA CAAAAAGTGA TTTCTCATCC AAATTATGAC TCCAAGACCA 1080 

AGAACAATGA CATTGCGCTG ATGAAGCTGC AGAAGCCTCT GACTTTCAAC GACCTAGTGA 1140 

AACCAGTGTG TCTGCCCAAC CCAGGCATGA TGCTGCAGCC AGAACAGCTC TGCTGGATTT 1200 

CCGGGTGGGG GGCCACCGAG GAGAAAGGGA AGACCTCAGA AGTGCTGAAC GCTGCCAAGG 1260 

TGCTTCTCAT TGAGACACAG AGATGCAACA GCAGATATGT CTATGACAAC CTGATCACAC 1320 

CAGCCATGAT CTGTGCCGGC TTCCTGCAGG GGAACGTCGA TTCTTGCCAG GGTGACAGTG 1380 

GAGGGCCTCT GGTCACTTCG AACAACAATA TCTGGTGGCT GATAGGGGAT ACAAGCTGGG 1440 

GTTCTGGCTG TGCCAAAGCT TACAGACCAG GAGTGTACGG GAATGTGATG GTATTCACGG 1500 

ACTGGATTTA TCGACAAATG AAGGCAAACG GCTAATCCAC ATGGTCTTCG TCCTTGACGT 1560 

CGTTTTACAA GAAAACAATG GGGCTGGTTT TGCTTCCCCG TGCATGATTT ACTCTTAGAG 1620

ATGATTCAGA GGTCACTTCA TTTTTATTAA ACAGTGAACT TGTCTGGCTT TGGCACTCTC 1680 

TGCCATACTG TGCAGGCTGC AGTGGCTCCC CTGCCCAGCC TGCTCTCCCT AACCCCTTGT 1740 

CCGCAAGGGG TGATGGCCGG CTGGTTGTGG GCACTGGCGG TCAATTGTGG AAGGAAGAGG 1800 

GTTGGAGGCT GCCCCCATTG AGATCTTCCT GCTGAGTCCT TTCCAGGGGC CAATTTTGGA 1860 

TGAGCATGGA GCTGTCACTT CTCAGCTGCT GGATGACTTG AGATGAAAAA GGAGAGACAT 1920 

GGAAAGGGAG ACAGCCAGGT GGCACCTGCA GCGGCTGCCC TCTGGGGCCA CTTGGTAGTG 1980 

TCCCCAGCCT ACTTCACAAG GGGATTTTGC TGATGGGTTC TTAGAGCCTT AGCAGCCCTG 2040 

GATGGTGGCC AGAAATAAAG GGACCAGCCC TTCATGGGTG GTGACGTGGT AGTCACTTGT 2100 

AAGGGGAACA GAAACATTTT TGTTCTTATG GGGTGAGAAT ATAGACAGTG CCCTTGGTGC 2160

GAGGGAAGCA ATTGAAAAGG AACTTGCCCT GAGCACTCCT GGTGCAGGTC TCCACCTGCA 2220

CATTGGGTGG GGCTCCTGGG AGGGAGACTC AGCCTTCCTC CTCATCCTCC CTGACCCTGC 2280 

TCCTAGCACC CTGGAGAGTG AATGCCCCTT GGTCCCTGGC AGGGCGCCAA GTTTGGCACC 2340 

ATGTCGGCCT CTTCAGGCCT GATAGTCATT GGAAATTGAG GTCCATGGGG GAAATCAAGG 2400 

ATGCTCAGTT TAAGGTACAC TGTTTCCATG TTATGTTTCT ACACATTGAT GGTGGTGACC 2460 
CTGAGTTCAA AGCCATCTT 






SEQ ID NO:173 PEL3 Protein sequence:
Protein Accession #: NP_005647.1

1 11 21 31 41 51
| | | | | |
MALNSGSPPA IGPYYENHGY QPENPYPAQP TVVPTVYEVH PAQYYPSPVP QYAPRVLTQA 60
SNPVVCTQPK SPSGTVCTSK TKKALCITLT LGTFLVGAAL AAGLLWKFMG SKCSNSGIEC 120
DSSGTCINPS NWCDGVSHCP GGEDENRCVR LYGPNFILQM YSSQRKSWHP VCQDDWNENY 180
GRAACRDMGY KNNFYSSQGI VDDSGSTSFM KLNTSAGNVD IYKKLYHSDA CSSKAVVSLR 240
CLACGVNLNS SRQSRIVGGE SALPGAWPWQ VSLHVQNVHV CGGSIITPEW IVTAAHCVEK 300
PLNNPWHWTA FAGILRQSFM FYGAGYQVQK VISHPNYDSK TKNNDIALMK LQKPLTFNDL 360
VKPVCLPNPG MMLQPEQLCW ISGWGATEEK GKTSEVLNAA KVLLIETQRC NSRYVYDNLI 420
TPAMICAGFL QGNVDSCQGD SGGPLVTSNN NIWWLIGDTS WGSGCAKAYR PGVYGNVMVF 480
TDWIYRQMKA NG



SEQ ID NO:174 PBJ4 DNA SEQUENCE
Nucleic Acid Accession #: AI694767
Coding sequence: 130-1086 (underlined sequences correspond to start and stop codons)



CAGAGAGGCT 
GGGGTCACAC 
AGCTTCTTCA 
ATAGGCCTCC 
TACCTTATTG 
CTGCATGAGC 
ACCTCATCCA 
GATGCTTGTC 
CTGCTGGCCA 
GTACTTACGT 
CTGATGGCAC 
TCCCATTCCT 
AATGTCGTCT 
TCCTTCTCAT 
AAGGCATTTG 
ATTGGATTGT 
TTGGCCAATA 
ACAAAGGAGA 
CCCTAGGTGT 
GTTAACATTT 
ATCCTTCAAA 
GTTTTCTTGC 
TTTTCATTTT 
GAGATAAGAA 
TAAACACAGA 
ACTCCCAACC 
AAATAATTTT 
AGAGTACATT 
ATGGACCCTG 
TTAGTACCCT 
GGGGTCATAC 
GGAAGAACTG 
TTCTARAGGA 
GCAACAGAAC 
AATTACCTGT 
AGAAAGTCTG 
TGATAGGCAG 
TGAAGATAAC 
ACCATGCTTT 
ATCTGACTTA 
ATAGGTTTCA 
TACTAAAACA 
CCTGATATGG 
AATGCCTATT 
TATTGAATGT 
AAAGTGCCTA 
TTCCTTCTGT 
TTAAATTTTA 
GCTCATAAAA 



11 

I 

GTATTTCAGT 
ATTCCTTCCA 
TGATGGTGGA 
CTGGTTTAGA 
CTGTGCTAGG 
CCATGTATAT 
TGCCCAAAAT 
TGCTACAGAT 
TGGCTTTTGA 
TGCCTCGTGT 
CCCTTCCTGT 
ACTGCCTACA 
ATGGCCTTAT 
ATCTGCTTAT 
GCACTTGCGT 
CCATGGTGCA 
TCTATCTGCT 
TTCGACAGCG 
CAGTGATCAA 
TGGAAGACAG 
TATGAAACTG 
TACATATAAT 
ACCATGCAGT 
TGGTACATCT 
ATATAATAAA 
ACATTGGATC 
TCCTCTGGAC 
TACCTACGTT 
TTTTTCCTAT 
CATTGTAGCC 
AAGTATAAAA 
TTAAAGAGAC 
GGTATTTAAT 
TCATGGCTTT 

GTCTTGGAAG 
CATAGGGCTT 
TGAGGTTAGG 
ATTGGCCTTT 
ATTTGGGGCT 
GGCATGGGAA 
TCTTCAACAG 
TGTGATCATA 
ATTCCTATNA 
TAATACTTGT 
CATCTCTGTT 
GAACATAATA 
GCTGAACACA 
GCCATTACTT 
CCCTCCCATG 



21 



GCAGCCTGCC 
TACGGTTGAG 
TCCCAATGGC 
AGAGGCTCAG 
TAACTTGACA 
ATTTCTTTGC 
GCTGGCCATC 
GTTTGCCATC 
CCGCTATGTG 
CACCAAAATT 
CTTCATCAAG 
CCAAGATGTC 
CGTCATCATC 
TCTTAAGACT 
CTCTCATGTG 
TCGCTTTAGC 
GGTTCCTCCT 
CATCCTTCGA 
ACTTCTTTTC 
TATTCAGAAA 
GTTGGGGAAT 
TATTAATACC 
CCAAATCTAA 
AGAGAACATT 
ATGAGATAAT 
TCAGAAAAAT 
ACTAGCACTT 
AATGAAAGTT 
TTAATTTTCT 
ATGGGAAAAT 
ATTAAAAAAA 
CAACAGGGTA 
TTCTTCTCAC 
AATCCCACTA 
AAGTGATTTC 
ATAGCAAGTT 
GAGCCACCAG 
TGAGTGTGAC 
TTGTGCAGTA 
TCAGGCATTT 
GATATGACAA 
TATGTGGTAA 
CATGCTTTCA 
ATTTGCTGCT 
CATCATTGAC 
GTGCTTATGC 
TAGCCAGGCA 
CCAATGTGAG 
TGCAGCCTTT 



31 
I 

AGACCTCTTC 
CCTCTACCTG 
AATGAATCCA 
TTCTGGTTGG 
ATCATCTACA 
ATGCTTTCAG 
TTCTGGTTCA 
CACTCCTTAT 
GCCATCTGTC 
GGTGTGGCTG 
CAGCTGCCCT 
ATGAAGCTGG 
TCCGCCATTG 
GTGTTGGGCT 
TGTGCTGTGT 
AAGCGGCGTG 
GTGCTCAACC 
CTTTTCCATG 
CATTCAGAGT 
AAAAATTTCC 
CTCCATTTTT 
CTGACTAGGT 
ACTGCTTCTA 
TGCCAAAGGC 
CTAGCTTAAA 
ACTGTCTTCA 
AAGGGGAAGA 
GACACACTGT 
TATCAACCCT 
TGATGTTCAG 
AAAGACTTCA 
GTGGGTTAGA 
TCATCCAGTG 
GCTATTGCTT 
TAGGTTCACC 
ATTTATTTTT 
TTATGATGGG 
TCGTAGCTGG 
TGGAACAGGG 
TTGCTTCTGA 
CAGTCTTAAC 
GTTTCATTTT 
TCCCCTTTTG 
GGACTGTAAG 
TGCTCTTTGC 
TTGACACCGG 
ATTTTCCAGC 
TGGAAGTGAC 
CATGTTGACA 



41 
I 

TGGAGGAAGA 
CCTGGTGCTG 
GTGCTACATA 
CCTTCCCATT 
TTGTGCGGAC 
GCATTGACAT 
ATTCCACTAC 
CTGGCATGGA 
ACCCACTGCG 
CTGTGGTGCG 
TCTGCCGCTC 
CCTGTGATGA 
GCCTGGACTC 
TGACACGTGA 
TCATATTCTA 
ACTCTCCACT 
CAATTGTCTA 
TGGCCACACA 
CCTCTGATTC 
TTAATAAAAA 
TCAATATTAT 
TGTGGTTGGA 
CTGATGGTTT 
CTAAGCACAG 
ACTATAACTT 
AAATGACTTC 
TTGGAAGTAA 
TCTGAGAGTT 
TTAATTAGGC 
TGGGGATCAG 
TGCCCAATCT 
GATTTCCAGA 
TTGTATTTAG 
ATTGTCCTGG 
ATTATGGAAG 
AAAAGTTCCA 
AAGTATGGAA 
AAAGTGAGGG 
ACTTTGAGAC 
GGGGCTATTA 
CAAGAAACTC 
CTTTTTCAAT 
TAATGGATAT 
CCCATGAGGG 
TCATCATTGA 
TTATTTTTCA 
CTTCTTTGAG 
ATGTGC AATT 
TTAAATGTGA 



51 



CTGGACAAAG 
GTCACAGTTC 
CTTCATCCTA 
GTGCTCCCTC 
TGAGCACAGC 
CCTCATCTCC 
CATCCAGTTT 
ATCCACAGTG 
CCATGCCACA 
GGGGGCTGCA 
CAATATCCTT 
TATCCGGGTC 
ACTTCTCATC 
AGCCCAGGCC 
TGTACCTTTC 
GCCCGTCATC 
TGGAGTGAAG 
CGCTTCAGAG 
AGATTTTAAT 
TACAACTCAG 

GGGTTATTAC 
AC AGCATTC T 
CAAAGGAAAA 
CCTCTTCAGA 
TACAGAGAAG 
AGCCTTGAAA 
TTCACAGCAT 
AAAGATATTA 
TGAATTAAAT 
CATATGATGT 
GTCTTACATT 
GAATTTCCTG 
TCCAATTGCC 
ATTCTTATTC 
TAGGTGTTTC 
TGGCAGGTGT 
AATCTTCAGG 
CGGGAAAGCA 
CCAAGGGTTA 
AAATTACATA 
CCTCAGGTTC 
CATATTTGGA 
CACTGTTTAT 
ATCCCCCAGC 
TCAAACCTGA 
TTGGG T ATT A 
TTTATACCTG 
CTTGGGAAGC 



60 
120 
180 
240 
300 
360 
420 
480 
540 
600 
660 
720 
780 
840 
900 
960 
1020 
1080 
1140 
1200 
1260 
1320 
1380 
1440 
1500 
1560 
1620 
1680 
1740 
1800 
1860 
1920 
1980 
2040 
2100 
2160 
2220 
2280 
2340 
2400 
2460 
2520 
2580 
2640 
2700 
2760 
2820 
2880 
2940 






Attorney Docket No.: 018501-004200US 






TATGTGTTAC ACAGAGTTAA TTAACCNGAA AGGCCTGGNA ATTTTTTGNK AANNAAACTG 3000 
TGGCCNNGAG GCCCNCAACC CTTTTTNNNA ATTTGGCAAN NTCCCACTTT GTANTTTGGT 3060 
AAGGAGGCCA GTTGGATAAG TGAAAAATAA AGTACTATTG TGTC 



SEQ ID NO:175 PBJ4 PROTEIN SEQUENCE
Protein Accession #: not available, cloned at Eos

1 11 21 31 41 51
MVDPNGNESS ATYFILIGLP GLEEAQFWLA FPLCSLYLIA VLGNLTIIYI VRTEHSLHEP 60
MYIFLCMLSG IDILISTSSM PKMLAIFWFN STTIQFDACL LQMFAIHSLS GMESTVLLAM 120
AFDRYVAICH PLRHATVLTL PRVTKIGVAA VVRGAALMAP LPVFIKQLPF CRSNILSHSY 180
CLHQDVMKLA CDDIRVNVVY GLIVIISAIG LDSLLISFSY LLILKTVLGL TREAQAKAFG 240
TCVSHVCAVP IFYVPFIGLS MVHRFSKRRD SPLPVILANI YLLVPPVLNP IVYGVKTKEI 300
RQRILRLFKV ATHASEP



SEQ ID NO:176 PM72 DNA SEQUENCE
Nucleic Acid Accession #: NM_004624.1
Coding sequence: 57-1544 (underlined sequences correspond to start and stop codons)

TCGGAGCCTG CGGAGGGTGG TGGTGGTGGT GGTGGTGGCC CTCGCCCGCC TCACTCATGC 60
CTCCTCCTCC TCTGCTCTCG CTCAGGCGCC TCGGTGGCGG TTGGTCGGCG GTTACGCGGC 120
TGGTGGTCGC GGCGGCCGGG GCTCGCTCTC GGGGAGGCCG GGGCGGATCT CGCGGCGCAG 180
GCGGCGGCGG CCGAGGTGGG GTCGCGCGGC GGAGGCGGCT CGAGCTTCGT GCTGCGCGCT 240
CGCTCTTGGG CTCCTCGCTG CAGGAGGAGT GTGACTATGT GCAGATGATC GAGGTGCAGC 300
ACAAGCAGTG CCTGGAGGAG GCCCAGCTGG AGAATGAGAC AATAGGCTGC AGCAAGATGT 360
GGGACAACCT CACCTGCTGG CCAGCCACCC CTCGGGGCCA GGTAGTTGTC TTGGCCTGTC 420
CCCTCATCTT CAAGCTCTTC TCCTCCATTC AAGGCCGCAA TGTAAGCCGC AGCTGCACCG 480
ACGAAGGCTG GACGCACCTG GAGCCTGGCC CGTACCCCAT TGCCTGTGGT TTGGATGACA 540
AGGCAGCGAG TTTGGATGAG CAGCAGACCA TGTTCTACGG TTCTGTGAAG ACCGGCTACA 600
CCATTGGCTA CGGCCTGTCC CTCGCCACCC TTCTGGTCGC CACAGCTATC CTGAGCCTGT 660
TCAGGAAGCT CCACTGCACG CGGAACTACA TCCACATGCA CCTCTTCATA TCCTTCATCC 720
TGAGGGCTGC CGCTGTCTTC ATCAAAGACT TGGCCCTCTT CGACAGCGGG GAGTCGGACC 780
AGTGCTCCGA GGGCTCGGTG GGCTGTAAGG CAGCCATGGT CTTTTTCCAA TATTGTGTCA 840
TGGCTAACTT CTTCTGGCTG CTGGTGGAGG GCCTCTACCT GTACACCCTG CTTGCCGTCT 900
CCTTCTTCTC TGAGCGGAAG TACTTCTGGG GGTACATACT CATCGGCTGG GGGGTACCCA 960
GCACATTCAC CATGGTGTGG ACCATCGCCA GGATCCATTT TGAGGATTAT GGTCTGCTCA 1020
GGTGCTGGGA CACCATCAAC TCCTCACTGT GGTGGATCAT AAAGGGCCCC ATCCTCACCT 1080
CCATCTTGGT AAACTTCATC CTGTTTATTT GCATCATCCG AATCCTGCTT CAGAAACTGC 1140
GGCCCCCAGA TATCAGGAAG AGTGACAGCA GTCCATACTC AAGGCTAGCC AGGTCCACAC 1200
TCCTGCTGAT CCCCCTGTTT GGAGTACACT ACATCATGTT CGCCTTCTTT CCGGACAATT 1260
TTAAGCCTGA AGTGAAGATG GTCTTTGAGC TCGTCGTGGG GTCTTTCCAG GGTTTTGTGG 1320
TGGCTATCCT CTACTGCTTC CTCAATGGTG AGGTGCAGGC GGAGCTGAGG CGGAAGTGGC 1380
GGCGCTGGCA CCTGCAGGGC GTCCTGGGCT GGAACCCCAA ATACCGGCAC CCGTCGGGAG 1440
GCAGCAACGG CGCCACGTGC AGCACGCAGG TTTCCATGCT GACCCGCGTC AGCCCAGGTG 1500
CCCGCCGCTC CTCCAGCTTC CAAGCCGAAG TCTCCCTGGT CTGACCACCA GGATCCCAGC 1560
CCAAGCGGCC CCTCCCGCCC CTTCCCACTC GCAGCAGACG CCGGGGACAG AGGCCTGCCC 1620
GGGCGCGCCA GCCCCGGCCC TGGGCTCGGA GGCTGCCCCC GGCCCCCTGG TCTCTGGTCC 1680
GGACACTCCT AGAGAACGCA GCCCTAGAGC CTGCCTGGAG CGTTTCTAGC AAGTGAGAGA 1740
GATGGGAGCT CCTCTCCTGG AGGATGCAGG TGGAACTCAG TCATTAGACT CCTCCTCCAA 1800
AGGCCCCCTA CGCCAATCAA GGGCAAAAAG TCTACATACT TTCATCCTGA CTCTGCCCCC 1860
TGCTGGCTCT TCTGCCCAAT TGGAGGAAAG CAACCGGTGG ATCCTCAAAC AACACTGGTG 1920
TGACCTGAGG GCAGAAAGGT TCTGCCCGGG AAGGTCACCA GCACCAACAC CACGGTAGTG 1980
CCTGAAATTT CACCATTGCT GTCAAGTTCC TTTGGGTTAA GCATTACCAC TCAGGCATTT 2040
GACTGAAGAT GCAGCTCACT ACCCTATTCT CTCTTTACGC TTAGTTATCA GCTTTTTAAA 2100
GTGGGTTATT CTGGAGTTTT TGTTTGGAGA GCACACCTAT CTTAGTGGTT CCCCACCGAA 2160
GTGGACTGGC CCCTGGGTCA GTCTGGTGGG AGGACGGTGC AACCCAAGGA CTGAGGGACT 2220
CTGAAGCCTC TGGGAAATGA GAAGGCAGCC ACCAGCGAAT GCTAGGTCTC GGACTAAGCC 2280
TACCTGCTCT CCAAGTCTCA GTGGCTTCAT CTGTCAAGTG GGACTCTGTC ACACCAGCCA 2340
TTCTTATCTC TCTGTGCTGT GGAAGCAACA GGAATCAAGA GACTGCCCTC CTTGTCCACC 2400
CACCTATGTG CCAACTGTTG TAACTAGGCT CAGAGATGTG CACCCATGGG CTCTGACAGA 2460
AAGCAGATCC TCACCCTGCT ACACATACAG GATTTGAACT CAGATCTGTC TGATAGGAAT 2520
GTGAAAGCAC GGACTCTTAC TGCTAACTTT TGTGTATCGT AACCAGCCAG ATCCTCTTGG 2580
TTATTTGTTT ACCACTTGTA TTATTAATGC CATTATCCCT GAATTCCCCT TGCCACCCCA 2640
CCCTCCCTGG AGTGTGGCTG AGGAGGCCTC CATCTCATGT ATCATCTGGA TAGGAGCCTG 2700
CTGGTCACAG CCTCCTCTGT CTGCCCTTCA CCCCAGTGGC CACTCAGCTT CCTACCCACA 2760
CCTCTGCCAG AAGATCCCCT CAGGACTGCA ACAGGCTTGT GCAACAATAA ATGTTGGCTT 2820
GGAAAAAAAA AAAA



SEQ ID NO:177 PM72 Protein sequence:
Protein Accession #: JC2195

1 11 21 31 41 51
| | | | | |
MPPPPLLSLR RLGGGWSAVT RLVVAAAGAR SRGGRGGSRG AGGGGRGGVA RRRRLELRAA 60
RSLLGSSLQE ECDYVQMIEV QHKQCLEEAQ LENETIGCSK MWDNLTCWPA TPRGQVVVLA 120
CPLIFKLFSS IQGRNVSRSC TDEGWTHLEP GPYPIACGLD DKAASLDEQQ TMFYGSVKTG 180
YTIGYGLSLA TLLVATAILS LFRKLHCTRN YIHMHLFISF ILRAAAVFIK DLALFDSGES 240
DQCSEGSVGC KAAMVFFQYC VMANFFWLLV EGLYLYTLLA VSFFSERKYF WGYILIGWGV 300
PSTFTMVWTI ARIHFEDYGL LRCWDTINSS LWWIIKGPIL TSILVNFILF ICIIRILLQK 360
LRPPDIRKSD SSPYSRLARS TLLLIPLFGV HYIMFAFFPD NFKPEVKMVF ELVVGSFQGF 420
VVAILYCFLN GEVQAELRRK WRRWHLQGVL GWNPKYRHPS GGSNGATCST QVSMLTRVSP 480
GARRSSSFQA EVSLV

SEQ ID NO:178 BFF8 DNA SEQUENCE 

Nucleic Acid Accession #: AL133619

Coding sequence: 1-2070 (underlined sequences correspond to start and stop codons)



1 11 21 31 41 51 

ATGAGCGGTG CGGGGGTGGC GGCTGGGACG CGGCCCCCCA GCTCGCCGAC CCCGGGCTCT 60

CGGCGCCGGC GCCAGCGCCC CTCTGTGGGC GTCCAGTCCT TGAGGCCGCA GAGCCCGCAG 120 

CTCAGGCAGA GCGACCCGCA GAAACGGAAC CTGGACCTGG AGAAAAGCCT GCAGTTCCTG 180 

CAGCAGCAGC ACTCGGAGAT GCTGGCCAAG CTCCATGAGG AGATCGAGCA TCTGAAGCGG 240 

GAAAACAAGG GTGAGCCGGC GCGGGGCCCT AGGCCGGCCC TGCCTCCCCA GGCACACTCA 300 

ACACTGCCGC TCCCGCAGCA CAGAAACACA GCCATCAACT CCAGCACACG CCTGGGCTCA 360

GGGGGAACAC AGGACGGGGA GCCCCTCCAG ACTGTCCTTG CCCACCTGGC TGCACTGGCC 420 

CCTGTATGCC AACCCAGTGG GTACAGGTTC TGGGGGACCT GGACAGATGC CGCTACCTCT 480 

AGCCGTGGCT GGACGATGTT ATGCAGCCAA GCACAGCACG TGCTGCTCTC GGGAAGCCCA 540 

GGGCCTGAGG TCATTGCAGG GCGGCAGGTG GCCACAGGGT GCTCCCCAGA CCTCCCTCCT 600 

CCAAGTAGAG CTGAAATGGG AAGGAACCCC TGGGACAGCC CCTGCCCTGC TAGATCTTTG 660

CCTCAGATTG CTGCTGTGGC CAGGCCCAGG ATTTCCAGCC CTATGGCTCT GAGTCCTCAC 720

ATGCTGGGGG CCCAGGGGAT ATGGACACAC TCCATCCAGG GATCCCTTCC TGCCATCTGG 780

GCAGCAACCA TGGGGACAAA GGGAGGAAGC AGAGTCCTGT TTCCTTGCCA CTTGTCCAAG 840

GCACTTCCCC ATCCTGACAG CGGCCCCCAC CCAGCCCAGG ATCCTGGGCT GTGGTCTCAA 900

GCTCACTTCC CATTATCTTT GGGGCTGGGG CTGACATCAG GAGGACATCT GACTGGTGGA 960

TGGAGCCAGC CTGGGAACAT CGCAGCTGGG GCAGTGCCTA GGGCTCTCCC TTCCCAGGGA 1020

GACATGGAGA AGGGGGTTGA GGGAGGGCCC TTCCCTAGCC GCTGTGGCAA CTCCAGTGAG 1080

CTGTTCTGGG CAAAGTGTGG CCCAAGTCGG CAGCCCCAGC CCTGCAGTGC TGGGGACGCT 1140

GACAGGACAC GGGAAGAGGC CATGCTTTCC CTCGGGACCT GCTGTTCCAT GTGTCCCAAG 1200

CCCTCCTGCT TTCCAGATGG CCCCTCAGGA AACCACCTTT CCAGGGCCTC TGCTCCCTTG 1260

GGCGCTCGCT GGGTCTGCAT CAACGGAGTG TGGGTAGAGC CGGGAGGACC CAGCCCTGCC 1320 

AGGCTGAAGG AGGGCTCCTC ACGGACACAC AGGCCAGGAG GCAAGCGTGG GCGTCTTGCG 1380

GGCGGTAGCG CCGACACTGT GCGCTCTCCT GCAGACAGCC TCTCCATGTC AAGCTTCCAG 1440 

TCTGTCAAGT CCATCTCTAA TTCAGCCAAC TCTCAAGGCA AGGCCAGGCC CCAGCCCGGC 1500

TCCTTCAACA AGCAAGATTC AAAAGCTGAC GTCTCCCAGA AGGCGGACCT GGAAGAGGAG 1560

CCCCTACTTC ACAACAGCAA GCTGGACAAA GTTCCTGGGG TACAAGGGCA GGCCAGAAAG 1620

GAGAAAGCAG AGGCCTCTAA TGCAGGAGCT GCCTGTATGG GGAACAGCCA GCACCAGGGC 1680

AGGCAGATGG GGGCGGGGGC ACACCCCCCA ATGATCCTGC CCCTTCCCCT GCGAAAGCCC 1740 

ACCACACTTA GGCAGTGCGA AGTGCTCATC CGCGAGCTGT GGAATACCAA CCTCCTGCAG 1800 

ACCCAAGAGC TGCGGCACCT CAAGTCCCTC CTGGAAGGGA GCCAGAGGCC CCAGGCAGCC 1860 

CCGGAGGAAG CTAGCTTTCC CAGGGACCAA GAAGCCACGC ATTTCCCCAA GGTCTCCACC 1920

AAGAGCCTCT CCAAGAAATG CCTGAGCCCA CCTGTGGCGG AGCGTGCCAT CCTGCCCGCA 1980 

CTGAAGCAGA CCCCGAAGAA CAACTTTGCC GAGAGGCAGA AGAGGCTGCA GGCAATGCAG 2040 
AAACGGCGCC TGCATCGCTC AGTGCTTTGA 


SEQ ID NO:179 BFF8 Protein sequence:

Protein Accession #: T43457

1 11 21 31 41 51
| | | | | |
MSGAGVAAGT RPPSSPTPGS RRRRQRPSVG VQSLRPQSPQ LRQSDPQKRN LDLEKSLQFL 60
QQQHSEMLAK LHEEIEHLKR ENKGEPARGP RPALPPQAHS TLPLPQHRNT AINSSTRLGS 120
GGTQDGEPLQ TVLAHLAALA PVCQPSGYRF WGTWTDAATS SRGWTMLCSQ AQHVLLSGSP 180
GPEVIAGRQV ATGCSPDLPP PSRAEMGRNP WDSPCPARSL PQIAAVARPR ISSPMALSPH 240
MLGAQGIWTH SIQGSLPAIW AATMGTKGGS RVLFPCHLSK ALPHPDSGPH PAQDPGLWSQ 300
AHFPLSLGLG LTSGGHLTGG WSQPGNIAAG AVPRALPSQG DMEKGVEGGP FPSRCGNSSE 360
LFWAKCGPSR QPQPCSAGDA DRTREEAMLS LGTCCSMCPK PSCFPDGPSG NHLSRASAPL 420
GARWVCINGV WVEPGGPSPA RLKEGSSRTH RPGGKRGRLA GGSADTVRSP ADSLSMSSFQ 480
SVKSISNSAN SQGKARPQPG SFNKQDSKAD VSQKADLEEE PLLHNSKLDK VPGVQGQARK 540
EKAEASNAGA ACMGNSQHQG RQMGAGAHPP MILPLPLRKP TTLRQCEVLI RELWNTNLLQ 600
TQELRHLKSL LEGSQRPQAA PEEASFPRDQ EATHFPKVST KSLSKKCLSP PVAERAILPA 660
LKQTPKNNFA ERQKRLQAMQ KRRLHRSVL
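
A quick consistency check between a stated coding range and the listed protein can be sketched as follows. This is a minimal illustration, assuming (as the headers state) that the range runs from the start codon through the stop codon, so that the number of residues is one less than the number of codons. The function names are hypothetical and not part of the original listing.

```python
def expected_residues(start: int, end: int) -> int:
    """Residues encoded by a 1-based inclusive CDS range that includes the stop codon."""
    length = end - start + 1
    if length % 3:
        raise ValueError(f"CDS length {length} is not a multiple of 3")
    return length // 3 - 1  # the last codon is the stop codon


def listed_residues(last_count: int, final_row: str) -> int:
    """Last printed position count plus the residues on the unnumbered final row."""
    return last_count + len(final_row.replace(" ", ""))


# BFF8: coding sequence 1-2070; last numbered protein row ends at 660.
final_row = "LKQTPKNNFA ERQKRLQAMQ KRRLHRSVL"
print(expected_residues(1, 2070), listed_residues(660, final_row))
```

When the two numbers disagree for an entry, the likely causes are a character lost or merged during text extraction, or a coding range that differs from the translated region.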






SEQ ID NO:180 BCR4 DNA SEQUENCE 

Nucleic Acid Accession #: NM_012319.2

Coding sequence: 138-2405 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51

| | | | | |

CTCGTGCCGA ATTCGGCACG AGACCGCGTG TTCGCGCCTG GTAGAGATTT CTCGAAGACA 60
CCAGTGGGCC CGTGTGGAAC CAAACCTGCG CGCGTGGCCG GGCCGTGGGA CAACGAGGCC 120 




GCGGAGACGA AGGCGCAATG GCGAGGAAGT TATCTGTAAT CTTGATCCTG ACCTTTGCCC 180 

TCTCTGTCAC AAATCCCCTT CATGAACTAA AAGCAGCTGC TTTCCCCCAG ACCACTGAGA 240 

AAATTAGTCC GAATTGGGAA TCTGGCATTA ATGTTGACTT GGCAATTTCC ACACGGCAAT 300 

ATCATCTACA ACAGCTTTTC TACCGCTATG GAGAAAATAA TTCTTTGTCA GTTGAAGGGT 360 

TCAGAAAATT ACTTCAAAAT ATAGGCATAG ATAAGATTAA AAGAATCCAT ATACACCATG 420

ACCACGACCA TCACTCAGAC CACGAGCATC ACTCAGACCA TGAGCGTCAC TCAGACCATG 480 

AGCATCACTC AGACCACGAG CATCACTCTG ACCATGATCA TCACTCTCAC CATAATCATG 540 

CTGCTTCTGG TAAAAATAAG CGAAAAGCTC TTTGCCCAGA CCATGACTCA GATAGTTCAG 600 

GTAAAGATCC TAGAAACAGC CAGGGGAAAG GAGCTCACCG ACCAGAACAT GCCAGTGGTA 660 

GAAGGAATGT CAAGGACAGT GTTAGTGCTA GTGAAGTGAC CTCAACTGTG TACAACACTG 720 

TCTCTGAAGG AACTCACTTT CTAGAGACAA TAGAGACTCC AAGACCTGGA AAACTCTTCC 780
CCAAAGATGT AAGCAGCTGC ACTCCACCCA GTGTCACATC AAAGAGCCGG GTGAGCCGGC 840 
TGGCTGGTAG GAAAACAAAT GAATCTGTGA GTGAGCCCCG AAAAGGCTTT ATGTATTCCA 900 
GAAACACAAA TGAAAATCCT CAGGAGTGTT TCAATGCATC AAAGCTACTG ACATCTCATG 960 

GCATGGGCAT CCAGGTTCCG CTGAATGCAA CAGAGTTCAA CTATCTCTGT CCAGCCATCA 1020 

TCAACCAAAT TGATGCTAGA TCTTGTCTGA TTCATACAAG TGAAAAGAAG GCTGAAATCC 1080 

CTCCAAAGAC CTATTCATTA CAAATAGCCT GGGTTGGTGG TTTTATAGCC ATTTCCATCA 1140 

TCAGTTTCCT GTCTCTGCTG GGGGTTATCT TAGTGCCTCT CATGAATCGG GTGTTTTTCA 1200 

AATTTCTCCT GAGTTTCCTT GTGGCACTGG CCGTTGGGAC TTTGAGTGGT GATGCTTTTT 1260 

TACACCTTCT TCCACATTCT CATGCAAGTC ACCACCATAG TCATAGCCAT GAAGAACCAG 1320 

CAATGGAAAT GAAAAGAGGA CCACTTTTCA GTCATCTGTC TTCTCAAAAC ATAGAAGAAA 1380 

GTGCCTATTT TGATTCCACG TGGAAGGGTC TAACAGCTCT AGGAGGCCTG TATTTCATGT 1440 

TTCTTGTTGA ACATGTCCTC ACATTGATCA AACAATTTAA AGATAAGAAG AAAAAGAATC 1500 

AGAAGAAACC TGAAAATGAT GATGATGTGG AGATTAAGAA GCAGTTGTCC AAGTATGAAT 1560 

CTCAACTTTC AACAAATGAG GAGAAAGTAG ATACAGATGA TCGAACTGAA GGCTATTTAC 1620 

GAGCAGACTC ACAAGAGCCC TCCCACTTTG ATTCTCAGCA GCCTGCAGTC TTGGAAGAAG 1680 

AAGAGGTCAT GATAGCTCAT GCTCATCCAC AGGAAGTCTA CAATGAATAT GTACCCAGAG 1740 

GGTGCAAGAA TAAATGCCAT TCACATTTCC ACGATACACT CGGCCAGTCA GACGATCTCA 1800 

TTCACCACCA TCATGACTAC CATCATATTC TCCATCATCA CCACCACCAA AACCACCATC 1860 

CTCACAGTCA CAGCCAGCGC TACTCTCGGG AGGAGCTGAA AGATGCCGGC GTCGCCACTT 1920 

TGGCCTGGAT GGTGATAATG GGTGATGGCC TGCACAATTT CAGCGATGGC CTAGCAATTG 1980 

GTGCTGCTTT TACTGAAGGC TTATCAAGTG GTTTAAGTAC TTCTGTTGCT GTGTTCTGTC 2040 

ATGAGTTGCC TCATGAATTA GGTGACTTTG CTGTTCTACT AAAGGCTGGC ATGACCGTTA 2100 

AGCAGGCTGT CCTTTATAAT GCATTGTCAG CCATGCTGGC GTATCTTGGA ATGGCAACAG 2160 

GAATTTTCAT TGGTCATTAT GCTGAAAATG TTTCTATGTG GATATTTGCA CTTACTGCTG 2220 

GCTTATTCAT GTATGTTGCT CTGGTTGATA TGGTACCTGA AATGCTGCAC AATGATGCTA 2280 

GTGACCATGG ATGTAGCCGC TGGGGGTATT TCTTTTTACA GAATGCTGGG ATGCTTTTGG 2340 

GTTTTGGAAT TATGTTACTT ATTTCCATAT TTGAACATAA AATCGTGTTT CGTATAAATT 2400 

TCTAGTTAAG GTTTAAATGC TAGAGTAGCT TAAAAAGTTG TCATAGTTTC AGTAGGTCAT 2460 

AGGGAGATGA GTTTGTATGC TGTACTATGC AGCGTTTAAA GTTAGTGGGT TTTGTGATTT 2520 

TTGTATTGAA TATTGCTGTC TGTTACAAAG TCAGTTAAAG GTACGTTTTA ATATTTAAGT 2580 

TATTCTATCT TGGAGATAAA ATCTGTATGT GCAATTCACC GGTATTACCA GTTTATTATG 2640 

TAAACAAGAG ATTTGGCATG ACATGTTCTG TATGTTTCAG GGAAAAATGT CTTTAATGCT 2700 

TTTTCAAGAA CTAACACAGT TATTCCTATA CTGGATTTTA GGTCTCTGAA GAACTGCTGG 2760 

TGTTTAGGAA TAAGAATGTG CATGAAGCCT AAAATACCAA GAAAGCTTAT ACTGAATTTA 2820 

AGCAAAGAAA TAAAGGAGAA AAGAGAAGAA TCTGAGAATT GGGGAGGCAT AGATTCTTAT 2880 

AAAAATCACA AAATTTGTTG TAAATTAGAG GGGAGAAATT TAGAATTAAG TATAAAAAGG 2940 

CAGAATTAGT ATAGAGTACA TTCATTAAAC ATTTTTGTCA GGATTATTTC CCGTAAAAAC 3000 

GTAGTGAGCA CTCTCATATA CTAATTAGTG TACATTTAAC TTTGTATAAT ACAGAAATCT 3060 

AAATATATTT AATGAATTCA AGCAATATAC ACTTGACCAA GAAATTGGAA TTTCAAAATG 3120 

TTCGTGCGGG TTATATACCA GATGAGTACA GTGAGTAGTT TATGTATCAC CAGACTGGGT 3180 

TATTGCCAAG TTATATATCA CCAAAAGCTG TATGACTGGA TGTTCTGGTT ACCTGGTTTA 3240 
CAAAATTATC AGAGTAGTAA AACTTTGATA TATATGAGGA TATTAAAACT ACACTAAGTA 3300 
TCATTTGATT CGATTCAGAA AGTACTTTGA TATCTCTCAG TGCTTCAGTG CTATCATTGT 3360 
GAGCAATTGT CTTTATATAC GGTACTGTAG CCATACTAGG CCTGTCTGTG GCATTCTCTA 3420 
GATGTTTCTT TTTTACACAA TAAATTCCTT ATATCAGCTT G 



SEQ ID NO:181 BCR4 PROTEIN SEQUENCE 
Protein Accession #: NP_036451



1 11 21 31 41 51 

MARKLSVILI LTFALSVTNP LHELKAAAFP QTTEKISPNW ESGINVDLAI STRQYHLQQL 60 

FYRYGENNSL SVEGFRKLLQ NIGIDKIKRI HIHHDHDHHS DHEHHSDHER HSDHEHHSDH 120 

EHHSDHDHHS HHNHAASGKN KRKALCPDHD SDSSGKDPRN SQGKGAHRPE HASGRRNVKD 180 

SVSASEVTST VYNTVSEGTH FLETIETPRP GKLFPKDVSS STPPSVTSKS RVSRLAGRKT 240

NESVSEPRKG FMYSRNTNEN PQECFNASKL LTSHGMGIQV PLNATEFNYL CPAIINQIDA 300 

RSCLIHTSEK KAEIPPKTYS LQIAWVGGFI AISIISFLSL LGVILVPLMN RVFFKFLLSF 360 

LVALAVGTLS GDAFLHLLPH SHASHHHSHS HEEPAMEMKR GPLFSHLSSQ NIEESAYFDS 420 

TWKGLTALGG LYFMFLVEHV LTLIKQFKDK KKKNQKKPEN DDDVEIKKQL SKYESQLSTN 480 

EEKVDTDDRT EGYLRADSQE PSHFDSQQPA VLEEEEVMIA HAHPQEVYNE YVPRGCKNKC 540

HSHFHDTLGQ SDDLIHHHHD YHHILHHHHH QNHHPHSHSQ RYSREELKDA GVATLAWMVI 600 

MGDGLHNFSD GLAIGAAFTE GLSSGLSTSV AVFCHELPHE LGDFAVLLKA GMTVKQAVLY 660

NALSAMLAYL GMATGIFIGH YAENVSMWIF ALTAGLFMYV ALVDMVPEML HNDASDHGCS 720 
RWGYFFLQNA GMLLGFGIML LISIFEHKIV FRINF 
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SEQ 10 NO:182 BCY2 DNA sequence 

Nucleic Acid Accession #: NMJXM203 

Coding sequence: 274-1 782 (underlined sequences correspond to start and stop codons} 
5 1 11 21 31 41 51 

I t E 1 I 1 

CGCGGGGCGC GGAGTCGGCG GGGCCTCGCG GGACGCGGGC AGTGCGGAGA CCGCGGCGCT 60 
GAGGACGCGG GAGCCGGGAG CGCACGCGCG GGGTGGAGTT CAGCCTACTC TTTCTTAGAT 120 
GTGAAAGGAA AGGAAGATCA TTTCATGCCT TGTTGATAAA GGTTCAGACT TCTGCTGATT 180 
I U CATA ACC ATT TGGCTCTG AG CTATGACA AG AG AGGAAAC A AAA AGTTA AA CTTACAAGCC 240 
TGCCATAAGT GAGAAGCAAA CTTCCTTGAT AACATGCTTT TGCGAAGTGC AGGAAAATTA 300 
AATGTGGGCA CCAAGAAAGA GG ATGGTGAG AGTACAGCCC CCACCCCCCG TCCAAAGGTC 360 
TTGCGTTGTA AATGCCACCA CCATTGTCCA GAAGACTCAG TCAACAATAT TTGCAGCACA 420 
GACGGATATT GTTTCACGAT GATAGAAGAG GATGACTCTG GGTTGCCTGT GGTCACTTCT 480 
1 D GGTTGCCTAG GACTAGAAGG CTCAGATTTT CAGTGTCGGG ACACTCCCAT TCCTCATCAA 540 
AGAAGATCAA TTGAATGCTG CACAGAAAGG AACGAATGTA ATAAAGACCT ACACCCTACA 600 
CTGCCTCCAT TGAAAAACAG AGATTTTGTT GATGGACCTA TACACCACAG GGCTTTACTT 660 
ATATCTGTGA CTGTCTGTAG TTTGCTCTTG GTCCTTATCA TATTATTTTG TTACTTCCGG 720 
TATAAAAGAC AAGAAACCAG ACCTCGATAC AGCATTGGGT TAGAACAGGA TGAAACTTAC 780 

ATTCCTCCTG GAGAATCCCT GAGAGACTTA ATTGAGCAGT CTCAGAGCTC AGGAAGTGGA 840
TCAGGCCTCC CTCTGCTGGT CCAAAGGACT ATAGCTAAGC AGATTCAGAT GGTGAAACAG 900
ATTGGAAAAG GTCGCTATGG GGAAGTTTGG ATGGGAAAGT GGCGTGGCGA AAAGGTAGCT 960
GTGAAAGTGT TCTTCACCAC AGAGGAAGCC AGCTGGTTCA GAGAGACAGA AATATATCAG 1020
ACAGTGTTGA TGAGGCATGA AAACATTTTG GGTTTCATTG CTGCAGATAT CAAAGGGACA 1080

GGGTCCTGGA CCCAGTTGTA CCTAATCACA GACTATCATG AAAATGGTTC CCTTTATGAT 1140
TATCTGAAGT CCACCACCCT AGACGCTAAA TCAATGCTGA AGTTAGCCTA CTCTTCTGTC 1200 
AGTGGCTTAT GTCATTTACA CACAGAAATC TTTAGTACTC AAGGCAAACC AGCAATTGCC 1260 
CATCGAGATC TGAAAAGTAA AAACATTCTG GTGAAGAAAA ATGGAACTTG CTGTATTGCT 1320 
GACCTGGGCC TGGCTGTTAA ATTTATTAGT GATACAAATG AAGTTGACAT ACCACCTAAC 1380

ACTCGAGTTG GCACCAAACG CTATATGCCT CCAGAAGTGT TGGACGAGAG CTTGAACAGA 1440
AATCACTTCC AGTCTTACAT CATGGCTGAC ATGTATAGTT TTGGCCTCAT CCTTTGGGAG 1500
GTTGCTAGGA GATGTGTATC AGGAGGTATA GTGGAAGAAT ACCAGCTTCC TTATCATGAC 1560 
CTAGTGCCCA GTGACCCCTC TTATGAGGAC ATGAGGGAGA TTGTGTGCAT CAAGAAGTTA 1620 
CGCCCCTCAT TCCCAAACCG GTGGAGCAGT GATGAGTGTC TAAGGCAGAT GGGAAAACTC 1680 

ATGACAGAAT GCTGGGCTCA CAATCCTGCA TCAAGGCTGA CAGCCCTGCG GGTTAAGAAA 1740
ACACTTGCCA AAATGTCAGA GTCCCAGGAC ATTAAACTCT GATAGGAGAG GAAAAGTAAG 1800
CATCTCTGCA GAAAGCCAAC AGGTACTCTT CTGTTTGTGG GCAGAGCAAA AGACATCAAA 1860 
TAAGCATCCA CAGTACAAGC CTTGAACATC GTCCTGCTTC CCAGTGGGTT CAGACCTCAC 1920 
CTTTCAGGGA GCGACCTGGG CAAAGACAGA GAAGCTCCCA GAAGGAGAGA TTGATCCGTG 1980 

TCTGTTTGTA GGCGGAGAAA CCGTTGGGTA ACTTGTTCAA GATATGATGC AT



SEQ ID NO:183 BCY2 Protein sequence

Protein Accession #: NP_001194 



1 11 21 31 41 51
| | | | | |



MLLRSAGKLN VGTKKEDGES TAPTPRPKVL RCKCHHHCPE DSVNNICSTD GYCFTMIEED 60
DSGLPVVTSG CLGLEGSDFQ CRDTPIPHQR RSIECCTERN ECNKDLHPTL PPLKNRDFVD 120
GPIHHRALLI SVTVCSLLLV LIILFCYFRY KRQETRPRYS IGLEQDETYI PPGESLRDLI 180

EQSQSSGSGS GLPLLVQRTI AKQIQMVKQI GKGRYGEVWM GKWRGEKVAV KVFFTTEEAS 240 
WFRETEIYQT VLMRHENILG FIAADIKGTG SWTQLYLITD YHENGSLYDY LKSTTLDAKS 300 
MLKLAYSSVS GLCHLHTEIF STQGKPAIAH RDLKSKNILV KKNGTCCIAD LGLAVKFISD 360 
TNEVDIPPNT RVGTKRYMPP EVLDESLNRN HFQSYIMADM YSFGLILWEV ARRCVSGGIV 420 
EEYQLPYHDL VPSDPSYEDM REIVCIKKLR PSFPNRWSSD ECLRQMGKLM TECWAHNPAS 480
RLTALRVKKT LAKMSESQDI KL



SEQ ID NO:184 CBF9 DNA sequence

Nucleic Acid Accession #: AC005383 

Coding Sequence: 328-2751 (underlined sequences correspond to start and stop codons) 

1 11 21 31 41 51

| | | | | |

GACAGTGTTC GCGGCTGCAC CGCTCGGAGG CTGGGTGACC CGCGTAGAAG TGAAGTACTT 60 

TTTTATTTGC AGACCTGGGC CGATGCCGCT TTAAAAAACG CGAGGGGCTC TATGCACCTC 120 

CCTGGCGGTA GTTCCTCCGA CCTCAGCCGG GTCGGGTCGT GCCGCCCTCT CCCAGGAGAG 180 

ACAAACAGGT GTCCCACGTG GCAGCCGCGC CCCGGGCGCC CCTCCTGTGA TCCCGTAGCG 240

CCCCCTGGCC CGAGCCGCGC CCGGGTCTGT GAGTAGAGCC GCCCGGGCAC CGAGCGCTGG 300 

TCGCCGCTCT CCTTCCGTTA TATCAACATG CCCCCTTTCC TGTTGCTGGA GGCCGTCTGT 360 

GTTTTCCTGT TTTCCAGAGT GCCCCCATCT CTCCCTCTCC AGGAAGTCCA TGTAAGCAAA 420 

GAAACCATCG GGAAGATTTC AGCTGCCAGC AAAATGATGT GGTGCTCGGC TGCAGTGGAC 480

ATCATGTTTC TGTTAGATGG GTCTAACAGC GTCGGGAAAG GGAGCTTTGA AAGGTCCAAG 540

CACTTTGCCA TCACAGTCTG TGACGGTCTG GACATCAGCC CCGAGAGGGT CAGAGTGGGA 600 

GCATTCCAGT TCAGTTCCAC TCCTCATCTG GAATTCCCCT TGGATTCATT TTCAACCCAA 660 
CAGGAAGTGA AGGCAAGAAT CAAGAGGATG GTTTTCAAAG GAGGGCGCAC GGAGACGGAA 720
CTTGCTCTGA AATACCTTCT GCACAGAGGG TTGCCTGGAG GCAGAAATGC TTCTGTGCCC 780 
CAGATCCTCA TCATCGTCAC TGATGGGAAG TCCCAGGGGG ATGTGGCACT GCCATCCAAG 840 
CAGCTGAAGG AAAGGGGTGT CACTGTGTTT GCTGTGGGGG TCAGGTTTCC CAGGTGGGAG 900 
GAGCTGCATG CACTGGCCAG CGAGCCTAGA GGGCAGCACG TGCTGTTGGC TGAGCAGGTG 960
GAGGATGCCA CCAACGGCCT CTTCAGCACC CTCAGCAGCT CGGCCATCTG CTCCAGCGCC 1020 
ACGCCAGACT GCAGGGTCGA GGCTCACCCC TGTGAGCACA GGACGCTGGA GATGGTCCGG 1080 
GAGTTCGCTG GCAATGCCCC ATGCTGGAGA GGATCGCGGC GGACCCTTGC GGTGCTGGCT 1140 
GCACACTGTC CCTTCTACAG CTGGAAGAGA GTGTTCCTAA CCCACCCTGC CACCTGCTAC 1200 
AGGACCACCT GCCCAGGCCC CTGTGACTCG CAGCCCTGCC AGAATGGAGG CACATGTGTT 1260

CCAGAAGGAC TGGACGGCTA CCAGTGCCTC TGCCCGCTGG CCTTTGGAGG GGAGGCTAAC 1320 

TGTGCCCTGA AGCTGAGCCT GGAATGCAGG GTCGACCTCC TCTTCCTGCT GGACAGCTCT 1380 

GCGGGCACCA CTCTGGACGG CTTCCTGCGG GCCAAAGTCT TCGTGAAGCG GTTTGTGCGG 1440

GCCGTGCTGA GCGAGGACTC TCGGGCCCGA GTGGGTGTGG CCACATACAG CAGGGAGCTG 1500
CTGGTGGCGG TGCCTGTGGG GGAGTACCAG GATGTGCCTG ACCTGGTCTG GAGCCTCGAT 1560

GGCATTCCCT TCCGTGGTGG CCCCACCCTG ACGGGCAGTG CCTTGCGGCA GGCGGCAGAG 1620 

CGTGGCTTCG GGAGCGCCAC CAGGACAGGC CAGGACCGGC CACGTAGAGT GGTGGTTTTG 1680 

CTCACTGAGT CACACTCCGA GGATGAGGTT GCGGGCCCAG CGCGTCACGC AAGGGCGCGA 1740 

GAGCTGCTCC TGCTGGGTGT AGGCAGTGAG GCCGTGCGGG CAGAGCTGGA GGAGATCACA 1800 
GGCAGCCCAA AGCATGTGAT GGTCTACTCG GATCCTCAGG ATCTGTTCAA CCAAATCCCT 1860

GAGCTGCAGG GGAAGCTGTG CAGCCGGCAG CGGCCAGGGT GCCGGACACA AGCCCTGGAC 1920 

CTCGTCTTCA TGTTGGACAC CTCTGCCTCA GTAGGGCCCG AGAATTTTGC TCAGATGCAG 1980 

AGCTTTGTGA GAAGCTGTGC CCTCCAGTTT GAGGTGAACC CTGACGTGAC ACAGGTCGGC 2040 

CTGGTGGTGT ATGGCAGCCA GGTGCAGACT GCCTTCGGGC TGGACACCAA ACCCACCCGG 2100 
GCTGCGATGC TGCGGGCCAT TAGCCAGGCC CCCTACCTAG GTGGGGTGGG CTCAGCCGGC 2160

ACCGCCCTGC TGCACATCTA TGACAAAGTG ATGACCGTCC AGAGGGGTGC CCGGCCTGGT 2220

GTCCCCAAAG CTGTGGTGGT GCTCACAGGC GGGAGAGGCG CAGAGGATGC AGCCGTTCCT 2280

GCCCAGAAGC TGAGGAACAA TGGCATCTCT GTCTTGGTCG TGGGCGTGGG GCCTGTCCTA 2340 

AGTGAGGGTC TGCGGAGGCT TGCAGGTCCC CGGGATTCCC TGATCCACGT GGCAGCTTAC 2400 
GCCGACCTGC GGTACCACCA GGACGTGCTC ATTGAGTGGC TGTGTGGAGA AGCCAAGCAG 2460

CCAGTCAACC TCTGCAAACC CAGCCCGTGC ATGAATGAGG GCAGCTGCGT CCTGCAGAAT 2520 

GGGAGCTACC GCTGCAAGTG TCGGGATGGC TGGGAGGGCC CCCACTGCGA GAACCGTGAG 2580 

TGGAGCTCTT GCTCTGTATG TGTGAGCCAG GGATGGATTC TTGAGACGCC CCTGAGGCAC 2640 

ATGGCTCCCG TGCAGGAGGG CAGCAGCCGT ACCCCTCCCA GCAACTACAG AGAAGGCCTG 2700 
GGCACTGAAA TGGTGCCTAC CTTCTGGAAT GTCTGTGCCC CAGGTCCTTA GAATGTCTGC 2760

TTCCCGCCGT GGCCAGGACC ACTATTCTCA CTGAGGGAGG AGGATGTCCC AACTGCAGCC 2820 

ATGCTGCTTA GAGACAAGAA AGCAGCTGAT GTCACCCACA AACGATGTTG TTGAAAAGTT 2880 

TTGATGTGTA AGTAAATACC CACTTTCTGT ACCTGCTGTG CCTTGTTGAG GCTATGTCAT 2940 

CTGCCACCTT TCCCTTGAGG ATAAACAAGG GGTCCTGAAG ACTTAAATTT AGCGGCCTGA 3000 
CGTTCCTTTG CACACAATCA ATGCTCGCCA GAATGTTGTT GACACAGTAA TGCCCAGCAG 3060

AGGCCTTTAC TAGAGCATCC TTTGGACGGC GAAGGCCACG GCCTTTCAAG ATGGAAAGCA 3120 

GCAGCTTTTC CACTTCCCCA GAGACATTCT GGATGCATTT GCATTGAGTC TGAAAGGGGG 3180 

CTTGAGGGAC GTTTGTGACT TCTTGGCGAC TGCCTTTTGT GTGTGGAAGA GACTTGGAAA 3240

GGTCTCAGAC TGAATGTGAC CAATTAACCA GCTTGGTTGA TGATGGGGGA GGGGCTGAGT 3300
TGTGCATGGG CCCAGGTCTG GAGGGCCACG TAAAATCGTT CTGAGTCGTG AGCAGTGTCC 3360

ACCTTGAAGG TCTTC 

SEQ ID NO:185 CBF9 Protein sequence
Protein Accession #: none found

1 11 21 31 41 51

MPPFLLLEAV CVFLFSRVPP SLPLQEVHVS KETIGKISAA SKMMWCSAAV DIMFLLDGSN 60
SVGKGSFERS KHFAITVCDG LDISPERVRV GAFQFSSTPH LEFPLDSFST QQEVKARIKR 120

MVFKGGRTET ELALKYLLHR GLPGGRNASV PQILIIVTDG KSQGDVALPS KQLKERGVTV 180 

FAVGVRFPRW EELHALASEP RGQHVLLAEQ VEDATNGLFS TLSSSAICSS ATPDCRVEAH 240 

PCEHRTLEMV REFAGNAPCW RGSRRTLAVL AAHCPFYSWK RVFLTHPATC YRTTCPGPCD 300

SQPCQNGGTC VPEGLDGYQC LCPLAFGGEA NCALKLSLEC RVDLLFLLDS SAGTTLDGFL 360
RAKVFVKRFV RAVLSEDSRA RVGVATYSRE LLVAVPVGEY QDVPDLVWSL DGIPFRGGPT 420

LTGSALRQAA ERGFGSATRT GQDRPRRVVV LLTESHSEDE VAGPARHARA RELLLLGVGS 480

EAVRAELEEI TGSPKHVMVY SDPQDLFNQI PELQGKLCSR QRPGCRTQAL DLVFMLDTSA 540 

SVGPENFAQM QSFVRSCALQ FEVNPDVTQV GLVVYGSQVQ TAFGLDTKPT RAAMLRAISQ 600

APYLGGVGSA GTALLHIYDK VMTVQRGARP GVPKAVVVLT GGRGAEDAAV PAQKLRNNGI 660
SVLVVGVGPV LSEGLRRLAG PRDSLIHVAA YADLRYHQDV LIEWLCGEAK QPVNLCKPSP 720

CMNEGSCVLQ NGSYRCKCRD GWEGPHCENR EWSSCSVCVS QGWILETPLR HMAPVQEGSS 780 

RTPPSNYREG LGTEMVPTFW NVCAPGP 



SEQ ID NO:186 PAV1 DNA sequence

Nucleic Acid Accession #: AF272890

Coding Sequence: 87-1520 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51

| | | | | |

TGCTACCCGC GCCCGGGCTT CTGGGGTGTT CCCCAACCAC GGCCCAGCCC TGCCACACCC 60 

CCCGCCCCCG GCCTCCGCAG CTCGGCATGG GCGCGGGGGT GCTCGTCCTG GGCGCCTCCG 120 

AGCCCGGTAA CCTGTCGTCG GCCGCACCGC TCCCCGACGG CGCGGCCACC GCGGCGCGGC 180 
TGCTGGTGCC CGCGTCGCCG CCCGCCTCGT TGCTGCCTCC CGCCAGCGAA AGCCCCGAGC 240
CGCTGTCTCA GCAGTGGACA GCGGGCATGG GTCTGCTGAT GGCGCTCATC GTGCTGCTCA 300
TCGTGGCGGG CAATGTGCTG GTGATCGTGG CCATCGCCAA GACGCCGCGG CTGCAGACGC 360
TCACCAACCT CTTCATCATG TCCCTGGCCA GCGCCGACCT GGTCATGGGG CTGCTGGTGG 420
TGCCGTTCGG GGCCACCATC GTGGTGTGGG GCCGCTGGGA GTACGGCTCC TTCTTCTGCG 480
AGCTGTGGAC CTCAGTGGAC GTGCTGTGCG TGACGGCCAG CATCGAGACC CTGTGTGTCA 540
TTGCCCTGGA CCGCTACCTC GCCATCACCT CGCCCTTCCG CTACCAGAGC CTGCTGACGC 600
GCGCGCGGGC GCGGGGCCTC GTGTGCACCG TGTGGGCCAT CTCGGCCCTG GTGTCCTTCC 660
TGCCCATCCT CATGCACTGG TGGCGGGCGG AGAGCGACGA GGCGCGCCGC TGCTACAACG 720
ACCCCAAGTG CTGCGACTTC GTCACCAACC GGGCCTACGC CATCGCCTCG TCCGTAGTCT 780
CCTTCTACGT GCCCCTGTGC ATCATGGCCT TCGTGTACCT GCGGGTGTTC CGCGAGGCCC 840
AGAAGCAGGT GAAGAAGATC GACAGCTGCG AGCGCCGTTT CCTCGGCGGC CCAGCGCGGC 900
CGCCCTCGCC CTCGCCCTCG CCCGTCCCCG CGCCCGCGCC GCCGCCCGGA CCCCCGCGCC 960
CCGCCGCCGC CGCCGCCACC GCCCCGCTGG CCAACGGGCG TGCGGGTAAG CGGCGGCCCT 1020
CGCGCCTCGT GGCCCTACGC GAGCAGAAGG CGCTCAAGAC GCTGGGCATC ATCATGGGCG 1080
TCTTCACGCT CTGCTGGCTG CCCTTCTTCC TGGCCAACGT GGTGAAGGCC TTCCACCGCG 1140
AGCTGGTGCC CGACCGCCTC TTCGTCTTCT TCAACTGGCT GGGCTACGCC AACTCGGCCT 1200
TCAACCCCAT CATCTACTGC CGCAGCCCCG ACTTCCGCAA GGCCTTCCAG GGACTGCTCT 1260
GCTGCGCGCG CAGGGCTGCC CGCCGGCGCC ACGCGACCCA CGGAGACCGG CCGCGCGCCT 1320
CGGGCTGTCT GGCCCGGCCC GGACCCCCGC CATCGCCCGG GGCCGCCTCG GACGACGACG 1380
ACGACGATGT CGTCGGGGCC ACGCCGCCCG CGCGCCTGCT GGAGCCCTGG GCCGGCTGCA 1440
ACGGCGGGGC GGCGGCGGAC AGCGACTCGA GCCTGGACGA GCCGTGCCGC CCCGGCTTCG 1500
CCTCGGAATC CAAGGTGTAG GGCCCGGCGC GGGGCGCGGA CTCCGGGCAC GGCTTCCCAG 1560
GGGAACGAGG AGATCTGTGT TTACTTAAGA CCGATAGCAG GTGAACTCGA AGCCCACAAT 1620
CCTCGTCTGA ATCATCCGAG GCAAAGAGAA AAGCCACGGA CCGTTGCACA AAAAGGAAAG 1680
TTTGGGAAGG GATGGGAGAG TGGCTTGCTG ATGTTCCTTG TTG

SEQ ID NO:187 PAV1 Protein sequence

Protein Accession #: AA011176

1 11 21 31 41 51

| | | | | |

MGAGVLVLGA SEPGNLSSAA PLPDGAATAA RLLVPASPPA SLLPPASESP EPLSQQWTAG 60
MGLLMALIVL LIVAGNVLVI VAIAKTPRLQ TLTNLFIMSL ASADLVMGLL VVPFGATIVV 120
WGRWEYGSFF CELWTSVDVL CVTASIETLC VIALDRYLAI TSPFRYQSLL TRARARGLVC 180
TVWAISALVS FLPILMHWWR AESDEARRCY NDPKCCDFVT NRAYAIASSV VSFYVPLCIM 240
AFVYLRVFRE AQKQVKKIDS CERRFLGGPA RPPSPSPSPV PAPAPPPGPP RPAAAAATAP 300
LANGRAGKRR PSRLVALREQ KALKTLGIIM GVFTLCWLPF FLANVVKAFH RELVPDRLFV 360
FFNWLGYANS AFNPIIYCRS PDFRKAFQGL LCCARRAARR RHATHGDRPR ASGCLARPGP 420
PPSPGAASDD DDDDVVGATP PARLLEPWAG CNGGAAADSD SSLDEPCRPG FASESKV



SEQ ID NO:188 BC02 DNA sequence
Nucleic Acid Accession #: AJ400877
Coding sequence: 81-3080 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51

| | | | | |



GGCGTCCGCG CACACCTCCC CGCGCCGCCG CCGCCACCGC CCGCACTCCG CCGCCTCTGC 60 
CCGCAACCGC TGAGCCATCC ATGGGGGTCG CGGGCCGCAA CCGTCCCGGG GCGGCCTGGG 120
CGGTGCTGCT GCTGCTGCTG CTGCTGCCGC CACTGCTGCT GCTGGCGGGG GCCGTCCCGC 180
CGGGTCGGGG CCGTGCCGCG GGGCCGCAGG AGGATGTAGA TGAGTGTGCC CAAGGGCTAG 240 
ATGACTGCCA TGCCGACGCC CTGTGTCAGA ACACACCCAC CTCCTACAAG TGCTCCTGCA 300 
AGCCTGGCTA CCAAGGGGAA GGCAGGCAGT GTGAGGACAT CGATGAATGT GGAAATGAGC 360
TCAATGGAGG CTGTGTCCAT GACTGTTTGA ATATTCCAGG CAATTATCGT TGCACTTGTT 420
TTGATGGCTT CATGTTGGCT CATGACGGTC ATAATTGTCT TGATGTGGAC GAGTGCCTGG 480
AGAACAATGG CGGCTGCCAG CATACCTGTG TCAACGTCAT GGGGAGCTAT GAGTGCTGCT 540
GCAAGGAGGG GTTTTTCCTG AGTGACAATC AGCACACCTG CATTCACCGC TCGGAAGAGG 600
GCCTGAGCTG CATGAATAAG GATCACGGCT GTAGTCACAT CTGCAAGGAG GCCCCAAGGG 660
GCAGCGTCGC CTGTGAGTGC AGGCCTGGTT TTGAGCTGGC CAAGAACCAG AGAGACTGCA 720 
TCTTGACCTG TAACCATGGG AACGGTGGGT GCCAGCACTC CTGTGACGAT ACAGCCGATG 780 
GCCCAGAGTG CAGCTGCCAT CCACAGTACA AGATGCACAC AGATGGGAGG AGCTGCCTTG 840
AGCGAGAGGA CACTGTCCTG GAGGTGACAG AGAGCAACAC CACATCAGTG GTGGATGGGG 900 
ATAAACGGGT GAAACGGCGG CTGCTCATGG AAACGTGTGC TGTCAACAAT GGAGGCTGTG 960 
ACCGCACCTG TAAGGATACT TCGACAGGTG TCCACTGCAG TTGTCCTGTT GGATTCACTC 1020
TCCAGTTGGA TGGGAAGACA TGTAAAGATA TTGATGAGTG CCAGACCCGC AATGGAGGTT 1080
GTGATCATTT CTGCAAAAAC ATCGTGGGCA GTTTTGACTG CGGCTGCAAG AAAGGATTTA 1140
AATTATTAAC AGATGAGAAG TCTTGCCAAG ATGTGGATGA GTGCTCTTTG GATAGGACCT 1200
GTGACCACAG CTGCATCAAC CACCCTGGCA CATTTGCTTG TGCTTGCAAC CGAGGGTACA 1260
CCCTGTATGG CTTCACCCAC TGTGGAGACA CCAATGAGTG CAGCATCAAC AACGGAGGCT 1320 
GTCAGCAGGT CTGTGTGAAC ACAGTGGGCA GCTATGAATG CCAGTGCCAC CCTGGGTACA 1380 
AGCTCCACTG GAATAAAAAA GACTGTGTGG AAGTGAAGGG GCTCCTGCCC ACAAGTGTGT 1440
CACCCCGTGT GTCCCTGCAC TGCGGTAAGA GTGGTGGAGG AGACGGGTGC TTCCTCAGAT 1500
GTCACTCTGG CATTCACCTC TCTTCAGATG TCACCACCAT CAGGACAAGT GTAACCTTTA 1560 
AGCTAAATGA AGGCAAGTGT AGTTTGAAAA ATGCTGAGCT GTTTCCCGAG GGTCTGCGAC 1620 
CAGCACTACC AGAGAAGCAC AGCTCAGTAA AAGAGAGCTT CCGCTACGTA AACCTTACAT 1680 
GCAGCTCTGG CAAGCAAGTC CCAGGAGCCC CTGGCCGACC AAGCACCCCT AAGGAAATGT 1740 
TTATCACTGT TGAGTTTGAG CTTGAAACTA ACCAAAAGGA GGTGACAGCT TCTTGTGACC 1800
TGAGCTGCAT CGTAAAGCGA ACCGAGAAGC GGCTCCGTAA AGCCATCCGC ACGCTCAGAA 1860
AGGCCGTCCA CAGGGAGCAG TTTCACCTCC AGCTCTCAGG CATGAACCTC GACGTGGCTA 1920 
AAAAGCCTCC CAGAACATCT GAACGCCAGG CAGAGTCCTG TGGAGTGGGC CAGGGTCATG 1980 
CAGAAAACCA ATGTGTCAGT TGCAGGGCTG GGACCTATTA TGATGGAGCA CGAGAACGCT 2040 
GCATTTTATG TCCAAATGGA ACCTTCCAAA ATGAGGAAGG ACAAATGACT TGTGAACCAT 2100
GCCCAAGACC AGGAAATTCT GGGGCCCTGA AGACCCCAGA AGCTTGGAAT ATGTCTGAAT 2160
GTGGAGGTCT GTGTCAACCT GGTGAATATT CTGCAGATGG CTTTGCACCT TGCCAGCTCT 2220
GTGCCCTGGG CACGTTCCAG CCTGAAGCTG GTCGAACTTC CTGCTTCCCC TGTGGAGGAG 2280
GCCTTGCCAC CAAACATCAG GGAGCTACTT CCTTTCAGGA CTGTGAAACC AGAGTTCAAT 2340

GTTCACCTGG ACATTTCTAC AACACCACCA CTCACCGATG TATTCGTTGC CCAGTGGGAA 2400
CATACCAGCC TGAATTTGGA AAAAATAATT GTGTTTCTTG CCCAGGAAAT ACTACGACTG 2460
ACTTTGATGG CTCCACAAAC ATAACCCAGT GTAAAAACAG AAGATGTGGA GGGGAGCTGG 2520 
GAGATTTCAC TGGGTACATT GAATCCCCAA ACTACCCAGG CAATTACCCA GCCAACACCG 2580 
AGTGTACGTG GACCATCAAC CCACCCCCCA AGCGCCGCAT CCTGATCGTG GTCCCTGAGA 2640 

TCTTCCTGCC CATAGAGGAC GACTGTGGGG ACTATCTGGT GATGCGGAAA ACCTCTTCAT 2700
CCAATTCTGT GACAACATAT GAAACCTGCC AGACCTACGA ACGCCCCATC GCCTTCACCT 2760
CCAGGTCAAA GAAGCTGTGG ATTCAGTTCA AGTCCAATGA AGGGAACAGC GCTAGAGGGT 2820
TCCAGGTCCC ATACGTGACA TATGATGAGG ACTACCAGGA ACTCATTGAA GACATAGTTC 2880 
GAGATGGCAG GCTCTATGCA TCTGAGAACC ATCAGGAAAT ACTTAAGGAT AAGAAACTTA 2940 

TCAAGGCTCT GTTTGATGTC CTGGCCCATC CCCAGAACTA TTTCAAGTAC ACAGCCCAGG 3000
AGTCCCGAGA GATGTTTCCA AGATCGTTCA TCCGATTGCT ACGTTCCAAA GTGTCCAGGT 3060
TTTTGAGACC TTACAAATGA CTCAGCCCAC GTGCCACTCA ATACAAATGT TCTGCTATAG 3120
GGTTGGTGGG ACAGAGCTGT CTTCCTTCTG CATGTCAGCA CAGTCGGGTA TTGCTGCCTC 3180 
CCGTATCAGT GACTCATTAG AGTTCAATTT TTATAGATAA TACAGATATT TTGGTAAATT 3240 

GAACTTGGTT TTTCTTTCCC AGCATCGTGG ATGTAGACTG AGAATGGCTT TGAGTGGCAT 3300
CAGCTTCTCA CTGCTGTGGG CGGATGTCTT GGATAGATCA CGGGCTGGCT GAGCTGGACT 3360
TTGGTCAGCC TAGGTGAGAC TCACCTGTCC TTCTGGGGTC TTACTCCTCC TCAAGGAGTC 3420
TGTAGTGGAA AGGAGGCCAC AGAATAAGCT GCTTATTCTG AAACTTCAGC TTCCTCTAGC 3480
CCGGCCCTCT CTAAGGGAGC CCTCTGCACT CGTGTGCAGG CTCTGACCAG GCAGAACAGG 3540

CAAGAGGGGA GGGAAGGAGA CCCCTGCAGG CTCCCTCCAC CCACCTTGAG ACCTGGGAGG 3600
ACTCAGTTTC TCCACAGCCT TCTCCAGCCT GTGTGATACA AGTTTGATCC CAGGAACTTG 3660 
AGTTCTAAGC AGTGCTCGTG AAAAAAAAAA GCAGAAAGAA TTAGAAATAA ATAAAAACTA 3720 
AGCACTTCTG GAGACAT 

SEQ ID NO:189 BC02 Protein sequence

Protein Accession #: CAB92285



1 11 21 31 41 51 

| | | | | |

MGVAGRNRPG AAWAVLLLLL LLPPLLLLAG AVPPGRGRAA GPQEDVDECA QGLDDCHADA 60 
LCQNTPTSYK CSCKPGYQGE GRQCEDIDEC GNELNGGCVH DCLNIPGNYR CTCFDGFMLA 120
HDGHNCLDVD ECLENNGGCQ HTCVNVMGSY ECCCKEGFFL SDNQHTCIHR SEEGLSCMNK 180
DHGCSHICKE APRGSVACEC RPGFELAKNQ RDCILTCNHG NGGCQHSCDD TADGPECSCH 240

PQYKMHTDGR SCLEREDTVL EVTESNTTSV VDGDKRVKRR LLMETCAVNN GGCDRTCKDT 300
STGVHCSCPV GFTLQLDGKT CKDIDECQTR NGGCDHFCKN IVGSFDCGCK KGFKLLTDEK 360 
SCQDVDECSL DRTCDHSCIN HPGTFACACN RGYTLYGFTH CGDTNECSIN NGGCQQVCVN 420 
TVGSYECQCH PGYKLHWNKK DCVEVKGLLP TSVSPRVSLH CGKSGGGDGC FLRCHSGIHL 480
SSDVTTIRTS VTFKLNEGKC SLKNAELFPE GLRPALPEKH SSVKESFRYV NLTCSSGKQV 540 

PGAPGRPSTP KEMFITVEFE LETNQKEVTA SCDLSCIVKR TEKRLRKAIR TLRKAVHREQ 600

FHLQLSGMNL DVAKKPPRTS ERQAESCGVG QGHAENQCVS CRAGTYYDGA RERCILCPNG 660
TFQNEEGQMT CEPCPRPGNS GALKTPEAWN MSECGGLCQP GEYSADGFAP CQLCALGTFQ 720 
PEAGRTSCFP CGGGLATKHQ GATSFQDCET RVQCSPGHFY NTTTHRCIRC PVGTYQPEFG 780 
KNNCVSCPGN TTTDFDGSTN ITQCKNRRCG GELGDFTGYI ESPNYPGNYP ANTECTWTIN 840 

PPPKRRILIV VPEIFLPIED DCGDYLVMRK TSSSNSVTTY ETCQTYERPI AFTSRSKKLW 900

IQFKSNEGNS ARGFQVPYVT YDEDYQELIE DIVRDGRLYA SENHQEILKD KKLIKALFDV 960
LAHPQNYFKY TAQESREMFP RSFIRLLRSK VSRFLRPYK 

SEQ ID NO:190 BFG1 DNA sequence
Nucleic Acid Accession #: AF007170

Coding sequence: 1-1725 (underlined sequences correspond to stop codon)

1 11 21 31 41 51 

AAGGAGGCGG CCTCCGGGAA AAGCGACCGC AGGACTCCTG AGAGCAGCCT CCATGAGGCC 60
CTGGACCAGT GCATGACCGC CCTGGACCTC TTCCTCACCA ACCAGTTCTC AGAAGCACTC 120
AGCTACCTCA AGCCCAGAAC CAAGGAAAGC ATGTACCACT CACTGACATA TGCCACCATC 180
CTGGAGATGC AGGCCATGAT GACCTTTGAC CCTCAGGACA TCCTGCTTGC CGGCAACATG 240
ATGAAGGAGG CACAGATGCT GTGTCAGAGG CACCGGAGGA AGTCTTCTGT AACAGATTCC 300

TTCAGCAGCC TGGTGAACCG CCCCACGCTG GGCCAATTCA CTGAAGAAGA AATCCACGCT 360
GAGGTCTGCT ATGCAGAGTG CCTGCTGCAG CGAGCAGCCC TGACCTTCCT GCAGGACGAG 420
AACATGGTGA GCTTCATCAA AGGCGGCATC AAAGTTCGAA ACAGCTACCA GACCTACAAG 480
GAGCTGGACA GCCTTGTTCA GTCCTCACAA TACTGCAAGG GTGAGAACCA CCCGCACTTT 540
GAAGGAGGAG TGAAGCTTGG TGTAGGGGCC TTCAACCTGA CACTGTCCAT GCTTCCTACT 600

AGGATCCTGA GGCTGTTGGA GTTTGTGGGG TTTTCAGGAA ACAAGGACTA TGGGCTGCTG 660
CAGCTGGAGG AGGGAGCGTC AGGGCACAGC TTCCGCTCTG TGCTCTGTGT CATGCTCCTG 720
CTGTGCTACC ACACCTTCCT CACCTTCGTG CTCGGTACTG GGAACGTCAA CATCGAGGAG 780
GCCGAGAAGC TCTTGAAGCC CTACCTGAAC CGGTACCCTA AGGGTGCCAT CTTCCTGTTC 840
TTTGCAGGGA GGATTGAAGT CATTAAAGGC AACATTGATG CAGCCATCCG GCGTTTCGAG 900 
GAGTGCTGTG AGGCCCAGCA GCACTGGAAG CAGTTTCACC ACATGTGCTA CTGGGAGCTG 960
ATGTGGTGCT TCACCTACAA GGGCCAGTGG AAGATGTCCT ACTTCTACGC CGACCTGCTC 1020
AGCAAGGAGA ACTGCTGGTC CAAGGCCACC TACATTTACA TGAAGGCCGC CTACCTCAGC 1080
ATGTTTGGGA AGGAGGACCA CAAGCCGTTC GGGGACGACG AAGTGGAATT ATTTCGAGCT 1140
GTGCCAGGCC TGAAGCTCAA GATTGCTGGG AAATCTCTAC CCACAGAGAA GTTTGCCATC 1200
CGGAAGTCCC GGCGCTACTT CTCCTCCAAC CCTATCTCGC TGCCAGTGCC TGCTCTGGAA 1260 
ATGATGTACA TCTGGAACGG CTACGCCGTG ATTGGGAAGC AGCCGAAACT CACGGATGGG 1320 
ATACTTGAGA TTATCACTAA GGCTGAAGAG ATGCTGGAGA AAGGCCCAGA GAACGAGTAC 1380 
TCAGTGGATG ACGAGTGCTT GGTGAAATTG TTGAAAGGCC TGTGTCTGAA ATACCTGGGC 1440 

CGTGTCCAGG AGGCCGAGGA GAATTTTAGG AGCATCTCTG CCAATGAAAA GAAGATTAAA 1500
TATGACCACT ACTTGATCCC AAACGCCCTG CTGGAGCTGG CCCTGCTGCT TATGGAGCAA 1560
GACAGAAACG AAGAGGCCAT CAAACTTTTG GAATCTGCCA AGCAAAACTA CAAGAATTAC 1620
TCCATGGAGT CAAGGACACA CTTTCGAATC CAGGCAGCCA CACTCCAAGC CAAGTCTTCC 1680
CTAGAGAACA GCAGCAGATC CATGGTCTCA TCAGTGTCCT TGTAGCTTTG TGCAGCAGTT 1740

CCGGGCTGGA AGACAGAGAC AGCTGGACAG AGCTCCTGAA AACATTTCAA AATACCCCCT 1800
CCCCCTGCCC TGCCCTGCCT TTGGGGTCCA CCGGCACTCC AGTTGGATGG CACAACATAG 1860 
TGTATCCGTG CAGAAGCCGA GCTGGCATTT TCACCAGTGT AGCCAAGGGC CTTTGCCAAG 1920
GGCAGAGCAG GTGGAGCCCT CTGCCTGCCC TATCACACAT ACGGGTACTT GCTTTTCACT 1980
GTGATGTTTA AGAGAATGTA TGAACAGTTT ACATTTTCCT TAGAAATACA TTGATGGGAT 2040

CACAGTTGGC TTTAAAAACC AACAACAATC AACCACCTGT AAGTCTTTGT CTTCACCTAT 2100
TATCATCTGG AGGTAAATCT CTTTATATGA TGATGCCAAA GGGCAAATTG CTTTTCAAAT 2160
TCAGCAAGTT CTCAGCTTGT GTGACGGAAG GTCCTTCAGA GGACCTGAGG AATGCCTGGG 2220
AGAGGCTAAG CCTCACGCTT CAATGCTTCT GGGGTTGGGC ATGAGGATGT ACACAGACAC 2280
CCACTACCTT ACTACTCACA CTTCATTTCA CTCCTTTTGT AAATTTCCAA TTTAAAAATC 2340

AAGCACGTCT TTTTAGTGAG ATAAAATCTG AGCTCTTCTG TAGAAAAATC AATCTCTACC 2400
AGTAGAAAAT GCCAGGGCTT GATGGAAGAG CTGTGTAGCC CTTTCTATGC CAAAGCCAGG 2460 
AAATTTGGGG GGCAGGAGGA GGTTCTCAGA ATCCAGTCTG TATCTTTGCT GTATGCCAAA 2520 
CTGAAACCAC TGGGAATAAT TTATGAAACA TAAAAATCTT CTGTACTTCA CTCCAAGGTA 2580 
CATTTGCTTA CTGACAGCAT TTTTGTTAAA ACTGTTATTC TTGAAAAAAA AAAAAAAAAA 2640

AA

SEQ ID NO:191 BFG1 Protein sequence

Protein Accession #: AAC39582 

1 11 21 31 41 51
| | | | | |

MTALDLFLTN QFSEALSYLK PRTKESMYHS LTYATILEMQ AMMTFDPQDI LLAGNMMKEA 60
QMLCQRHRRK SSVTDSFSSL VNRPTLGQFT EEEIHAEVCY AECLLQRAAL TFLQDENMVS 120 

FIKGGIKVRN SYQTYKELDS LVQSSQYCKG ENHPHFEGGV KLGVGAFNLT LSMLPTRILR 180
LLEFVGFSGN KDYGLLQLEE GASGHSFRSV LCVMLLLCYH TFLTFVLGTG NVNIEEAEKL 240
LKPYLNRYPK GAIFLFFAGR IEVIKGNIDA AIRRFEECCE AQQHWKQFHH MCYWELMWCF 300
TYKGQWKMSY FYADLLSKEN CWSKATYIYM KAAYLSMFGK EDHKPFGDDE VELFRAVPGL 360
KLKIAGKSLP TEKFAIRKSR RYFSSNPISL PVPALEMMYI WNGYAVIGKQ PKLTDGILEI 420

ITKAEEMLEK GPENEYSVDD ECLVKLLKGL CLKYLGRVQE AEENFRSISA NEKKIKYDHY 480
LIPNALLELA LLLMEQDRNE EAIKLLESAK QNYKNYSMES RTHFRIQAAT LQAKSSLENS 540 
SRSMVSSVSL 



SEQ ID NO:192 BF06 DNA sequence

Nucleic Acid Accession #: NM_032583

Coding sequence: 1-4044 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51

| | | | | |

ATGACTAGGA AGAGGACATA CTGGGTGCCC AACTCTTCTG GTGGCCTCGT GAATCGTGGC 60
ATCGACATAG GCGATGACAT GGTTTCAGGA CTTATTTATA AAACCTATAC TCTCCAAGAT 120 
GGCCCCTGGA GTCAGCAAGA GAGAAATCCT GAGGCTCCAG GGAGGGCAGC TGTCCCACCG 180
TGGGGGAAGT ATGATGCTGC CTTGAGAACC ATGATTCCCT TCCGTCCCAA GCCGAGGTTT 240

CCTGCCCCCC AGCCCCTGGA CAATGCTGGC CTGTTCTCCT ACCTCACCGT GTCATGGCTC 300
ACCCCGCTCA TGATCCAAAG CTTACGGAGT CGCTTAGATG AGAACACCAT CCCTCCACTG 360
TCAGTCCATG ATGCCTCAGA CAAAAATGTC CAAAGGCTTC ACCGCCTTTG GGAAGAAGAA 420 
GTCTCAAGGC GAGGGATTGA AAAAGCTTCA GTGCTTCTGG TGATGCTGAG GTTCCAGAGA 480 
ACAAGGTTGA TTTTCGATGC ACTTCTGGGC ATCTGCTTCT GCATTGCCAG TGTACTCGGG 540 

CCAATATTGA TTATACCAAA GATCCTGGAA TATTCAGAAG AGCAGTTGGG GAATGTTGTC 600
CATGGAGTGG GACTCTGCTT TGCCCTTTTT CTCTCCGAAT GTGTGAAGTC TCTGAGTTTC 660
TCCTCCAGTT GGATCATCAA CCAACGCACA GCCATCAGGT TCCGAGCAGC TGTTTCCTCC 720
TTTGCCTTTG AGAAGCTCAT CCAATTTAAG TCTGTAATAC ACATCACCTC AGGAGAGGCC 780 
ATCAGCTTCT TCACCGGTGA TGTAAACTAC CTGTTTGAAG GGGTGTGCTA TGGACCCCTA 840 

GTACTGATCA CCTGCGCATC GCTGGTCATC TGCAGCATTT CTTCCTACTT CATTATTGGA 900
TACACTGCAT TTATTGCCAT CTTATGCTAT CTCCTGGTTT TCCCACTGGC GGTATTCATG 960 
ACAAGAATGG CTGTGAAGGC TCAGCATCAC ACATCTGAGG TCAGCGACCA GCGCATCCGT 1020 
GTGACCAGTG AAGTTCTCAC TTGCATTAAG CTGATTAAAA TGTACACATG GGAGAAACCA 1080 
TTTGCAAAAA TCATTGAAGG TATGGAAAGT CTGACTTTCT GCTCCAAACC TGGTGATGGC 1140

ATGGCCTTCA GCATGCTGGC CTCCTTGAAT CTCCTTCGGC TGTCAGTGTT CTTTGTGCCT 1200
ATTGCAGTCA AAGGTCTCAC GAATTCCAAG TCTGCAGTGA TGAGGTTCAA GAAGTTTTTC 1260 
CTCCAGGAGA GCCCTGTTTT CTATGTCCAG ACATTACAAG ACCCCAGCAA AGCTCTGGTC 1320 
TTTGAGGAGG CCACCTTGTC ATGGCAACAG ACCTGTCCCG GGATCGTCAA TGGGGCACTG 1380 
GAGCTGGAGA GGAACGGGCA TGCTTCTGAG GGGATGACCA GGCCTAGAGA TGCCCTCGGG 1440 
CCAGAGGAAG AAGGGAACAG CCTGGGCCCA GAGTTGCACA AGATCAACCT GGTGGTGTCC 1500
AAGGGGATGA TGTTAGGGGT CTGCGGCAAC ACGGGGAGTG GTAAGAGCAG CCTGTTGTCA 1560 
GCCATCCTGG AGGAGATGCA CTTGCTCGAG GGCTCGGTGG GGGTGCAGGG AAGCCTGGCC 1620 
TATGTCCCCC AGCAGGCCTG GATCGTCAGC GGGAACATCA GGGAGAACAT CCTCATGGGA 1680
GGCGCATATG ACAAGGCCCG ATACCTCCAG GTGCTCCACT GCTGCTCCCT GAATCGGGAC 1740
CTGGAACTTC TGCCCTTTGG AGACATGACA GAGATTGGAG AGCGGGGCCT CAACCTCTCT 1800 
GGGGGGCAGA AACAGAGGAT CAGCCTGGCC CGCGCCGTCT ATTCCGACCG TCAGATCTAC 1860 
CTGCTGGACG ACCCCCTGTC TGCTGTGGAC GCCCACGTGG GGAAGCACAT TTTTGAGGAG 1920
TGCATTAAGA AGACACTCAG GGGGAAGACG GTCGTCCTGG TGACCCACCA GCTGCAGTAC 1980

TTAGAATTTT GTGGCCAGAT CATTTTGTTG GAAAATGGGA AAATCTGTGA AAATGGAACT 2040
CACAGTGAGT TAATGCAGAA AAAGGGGAAA TATGCCCAAC TTATCCAGAA GATGCACAAG 2100
GAAGCCACTT CGGACATGTT GCAGGACACA GCAAAGATAG CAGAGAAGCC AAAGGTAGAA 2160
AGTCAGGCTC TGGCCACCTC CCTGGAAGAG TCTCTCAACG GAAATGCTGT GCCGGAGCAT 2220
CAGCTCACAC AGGAGGAGGA GATGGAAGAA GGCTCCTTGA GTTGGAGGGT CTACCACCAC 2280

TACATCCAGG CAGCTGGAGG TTACATGGTC TCTTGCATAA TTTTCTTCTT CGTGGTGCTG 2340
ATCGTCTTCT TAACGATCTT CAGCTTCTGG TGGCTGAGCT ACTGGTTGGA GCAGGGCTCG 2400
GGGACCAATA GCAGCCGAGA GAGCAATGGA ACCATGGCAG ACCTGGGCAA CATTGCAGAC 2460
AATCCTCAAC TGTCCTTCTA CCAGCTGGTG TACGGGCTCA ACGCCCTGCT CCTCATCTGT 2520 
GTGGGGGTCT GCTCCTCAGG GATTTTCACC AAAGTCACGA GGAAGGCATC CACGGCCCTG 2580 

CACAACAAGC TCTTCAACAA GGTTTTCCGC TGCCCCATGA GTTTCTTTGA CACCATCCCA 2640
ATAGGCCGGC TTTTGAACTG CTTCGCAGGG GACTTGGAAC AGCTGGACCA GCTCTTGCCC 2700 
ATCTTTTCAG AGCAGTTCCT GGTCCTGTCC TTAATGGTGA TCGCCGTCCT GTTGATTGTC 2760 
AGTGTGCTGT CTCCATATAT CCTGTTAATG GGAGCCATAA TCATGGTTAT TTGCTTCATT 2820 
TATTATATGA TGTTCAAGAA GGCCATCGGT GTGTTCAAGA GACTGGAGAA CTATAGCCGG 2880

TCTCCTTTAT TCTCCCACAT CCTCAATTCT CTGCAAGGCC TGAGCTCCAT CCATGTCTAT 2940

GGAAAAACTG AAGACTTCAT CAGCCAGTTT AAGAGGCTGA CTGATGCGCA GAATAACTAC 3000
CTGCTGTTGT TTCTATCTTC CACACGATGG ATGGCATTGA GGCTGGAGAT CATGACCAAC 3060
CTTGTGACCT TGGCTGTTGC CCTGTTCGTG GCTTTTGGCA TTTCCTCCAC CCCCTACTCC 3120
TTTAAAGTCA TGGCTGTCAA CATCGTGCTG CAGCTGGCGT CCAGCTTCCA GGCCACTGCC 3180

CGGATTGGCT TGGAGACAGA GGCACAGTTC ACGGCTGTAG AGAGGATACT GCAGTACATG 3240
AAGATGTGTG TCTCGGAAGC TCCTTTACAC ATGGAAGGCA CAAGTTGTCC CCAGGGGTGG 3300
CCACAGCATG GGGAAATCAT ATTTCAGGAT TATCACATGA AATACAGAGA CAACACACCC 3360
ACCGTGCTTC ACGGCATCAA CCTGACCATC CGCGGCCACG AAGTGGTGGG CATCGTGGGA 3420 
AGGACGGGCT CTGGGAAGTC CTCCTTGGGC ATGGCTCTCT TCCGCCTGGT GGAGCCCATG 3480 

GCAGGCCGGA TTCTCATTGA CGGCGTGGAC ATTTGCAGCA TCGGCCTGGA GGACTTGCGG 3540
TCCAAGCTCT CAGTGATCCC TCAAGATCCA GTGCTGCTCT CAGGAACCAT CAGATTCAAC 3600
CTAGATCCCT TTGACCGTCA CACTGACCAG CAGATCTGGG ATGCCTTGGA GAGGACATTC 3660
CTGACCAAGG CCATCTCAAA GTTCCCCAAA AAGCTGCATA CAGATGTGGT GGAAAACGGT 3720 
GGAAACTTCT CTGTGGGGGA GAGGCAGCTG CTCTGCATTG CCAGGGCTGT GCTTCGCAAC 3780 

TCCAAGATCA TCCTTATCGA TGAAGCCACA GCCTCCATTG ACATGGAGAC AGACACCCTG 3840
ATCCAGCGCA CAATCCGTGA AGCCTTCCAG GGCTGCACCG TGCTCGTCAT TGCCCACCGT 3900 
GTCACCACTG TGCTGAACTG TGACCACATC CTGGTTATGG GCAATGGGAA GGTGGTAGAA 3960 
TTTGATCGGC CGGAGGTACT GCGGAAGAAG CCTGGGTCAT TGTTCGCAGC CCTCATGGCC 4020 
ACAGCCACTT CTTCACTGAG ATAAGGAGAT GTGGAGACTT CATGGAGGCT GGCAGCTGAG 4080

CTCAGAGGTT CACACAGGTG CAGCTTCGAG GCCCACAGTC TGCGACCTTC TTGTTTGGAG 4140
ATGAGAACTT CTCCTGGAAG CAGGGGTAAA TGTAGGGGGG GTGGGGATTG CTGGATGGAA 4200
ACCCTGGAAT AGGCTACTTG ATGGCTCTCA AGACCTTAGA ACCCCAGAAC CATCTAAGAC 4260
ATGGGATTCA GTGATCATGT CGTTCTCCTT TTAACTTACA TGCTGAATAA TTTTATAATA 4320
AGGTAAAAGC TTATAGTTTT CTGATCTGTG TTAGAAGTGY TGCAAATGCT GTACTGACTT 4380

TGTAAAATAT AAAACTAAGG AAAACTCAAA AAAAAAAAAA AAAAAAA

SEQ ID NO:193 BF06 Protein sequence

Protein Accession #: NP_115972.1

1 11 21 31 41 51
| | | | | |

MTRKRTYWVP NSSGGLVNRG IDIGDDMVSG LIYKTYTLQD GPWSQQERNP EAPGRAAVPP 60
WGKYDAALRT MIPFRPKPRF PAPQPLDNAG LFSYLTVSWL TPLMIQSLRS RLDENTIPPL 120
SVHDASDKNV QRLHRLWEEE VSRRGIEKAS VLLVMLRFQR TRLIFDALLG ICFCIASVLG 180 

PILIIPKILE YSEEQLGNVV HGVGLCFALF LSECVKSLSF SSSWIINQRT AIRFRAAVSS 240
FAFEKLIQFK SVIHITSGEA ISFFTGDVNY LFEGVCYGPL VLITCASLVI CSISSYFIIG 300
YTAFIAILCY LLVFPLAVFM TRMAVKAQHH TSEVSDQRIR VTSEVLTCIK LIKMYTWEKP 360 
FAKIIEGMES LTFCSKPGDG MAFSMLASLN LLRLSVFFVP IAVKGLTNSK SAVMRFKKFF 420 
LQESPVFYVQ TLQDPSKALV FEEATLSWQQ TCPGIVNGAL ELERNGHASE GMTRPRDALG 480

PEEEGNSLGP ELHKINLVVS KGMMLGVCGN TGSGKSSLLS AILEEMHLLE GSVGVQGSLA 540
YVPQQAWIVS GNIRENILMG GAYDKARYLQ VLHCCSLNRD LELLPFGDMT EIGERGLNLS 600 
GGQKQRISLA RAVYSDRQIY LLDDPLSAVD AHVGKHIFEE CIKKTLRGKT VVLVTHQLQY 660 
LEFCGQIILL ENGKICENGT HSELMQKKGK YAQLIQKMHK EATSDMLQDT AKIAEKPKVE 720
SQALATSLEE SLNGNAVPEH QLTQEEEMEE GSLSWRVYHH YIQAAGGYMV SCIIFFFVVL 780

IVFLTIFSFW WLSYWLEQGS GTNSSRESNG TMADLGNIAD NPQLSFYQLV YGLNALLLIC 840
VGVCSSGIFT KVTRKASTAL HNKLFNKVFR CPMSFFDTIP IGRLLNCFAG DLEQLDQLLP 900
IFSEQFLVLS LMVIAVLLIV SVLSPYILLM GAIIMVICFI YYMMFKKAIG VFKRLENYSR 960 
SPLFSHILNS LQGLSSIHVY GKTEDFISQF KRLTDAQNNY LLLFLSSTRW MALRLEIMTN 1020 
LVTLAVALFV AFGISSTPYS FKVMAVNIVL QLASSFQATA RIGLETEAQF TAVERILQYM 1080 

KMCVSEAPLH MEGTSCPQGW PQHGEIIFQD YHMKYRDNTP TVLHGINLTI RGHEVVGIVG 1140
RTGSGKSSLG MALFRLVEPM AGRILIDGVD ICSIGLEDLR SKLSVIPQDP VLLSGTIRFN 1200
LDPFDRHTDQ QIWDALERTF LTKAISKFPK KLHTDVVENG GNFSVGERQL LCIARAVLRN 1260
SKIILIDEAT ASIDMETDTL IQRTIREAFQ GCTVLVIAHR VTTVLNCDHI LVMGNGKVVE 1320
FDRPEVLRKK PGSLFAALMA TATSSLR 
SEQ ID NO:194 BHB8 DNA sequence



Nucleic Acid Accession #: AA983251

Coding sequence: 1-1749 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51



ATGCTGTCTG GCTTCTTGAT GAGTCCCAGT ACCCAGCACA GAGCACAGTA CACTCCCGGA 60 

GGAAAGAAAC TTCCGTGGGA GGCTTCCATC GGTGCGCACA CCTCCCGAGG GCGAGGCAGC 120 

GACCGGGAGA GGGAGAGCCG GCCGGAGGCT GCCGGGCTCC TGTGGGACCG CGCTGCAGCC 180 

GGGGAGGCGG AGAAGGGGAA CCGGGGCGAG CCGCCCGCCT GGATCCGCGC CCAGCAGCAG 240 

CCGCGGCCGC CGCCAGCTGG GCAGGCTCCC GGGACTGCGG CTGGGGGCGC GCAGGACCCT 300 

CGCCTGCGTC CTGGACGTTC CCGGGGGAGG GTCCGGTTGC CAGTGAAACC TCCAGAGGCT 360 

TCCGGACGAC AGCCCCGGGG GCCTTCTGAC TGCATCCCGA GATTTCCATC AGCGAGTGCA 420 

ACTCATAAGG CAGTCCCTAA GGGGACCGGG CCACCGGCTG AGGACGGGGA TGGCTTAGGA 480 

GCTCCTGGAC CTAGGGCCCG GCGTCGTCGC CTCCTGGGCG TCGCGGCAGA GGGGAGTGGC 540 

CCGCGCGGAA AGCGCCGCGG GACAGTCAGT GACGAGGCCC GGGGGTCGCC GGGGCCACGA 600 

CTTCTCGGAG ACCGTCCTGC GCTCTCTGGA GACGCGCTGT CCGCGCCCAG GGTGGTGCCA 660 

TGTGGGGCGC TCGCCGCTCG TCCGTCTCCT CATCCTGGAA CGCCGCTTCG CTCCTGCAGC 720 

TGCTGCTGGC TGCGCTGCTG GCGGCGGGGG CGAGGGCCCA GCGGCGAGTA CTGCCACGGC 780 

TGGCTGGACG CGCAGGGCGT CTGGCGCATC GGCTTCCAGT GTCCCGAGCG CTTCGACGGC 840 

GGCGACGCCA CCATCTGCTG CGGCAGCTGC GCGTTGCGCT ACTGCTGCTC CAGCGCCGAG 900 

GCGCGCCTGG ACCAGGGCGG CTGCGACAAT GACCGCCAGC AGGGCGCTGG CGAGCCTGGC 960 

CGGGCGGACA AAGACGGGCC CCGACGGCTC GGCAGGGCTT CATGTCTTAG GGGTACCCAA 1020 

GGAGACGGCG AGGGTGCGCC CCCACCCGTG AGGGCCTGGC AGCGGTGCTC CCCTGAAGGC 1080 

TCCCCGAAAG GAAGGCAGCT CCTCAGGGCT TTCCCGGGGC TGCTGCCCCG TGCCAGACGC 1140 

CGCGGATTCC CATCTTCTCC ACGCGGCGGC CCCTCTCCCC TGCAGCGGCC CGCCTTGCCC 1200 

ATCTACGTGC CGTTCCTCAT TGTTGGCTCC GTGTTTGTCG CCTTTATCAT CTTGGGGTCC 1260

CTGGTGGCAG CCTGTTGCTG CAGATGTCTC CGGCCTAAGC AGGATCCCCA GCAGAGCCGA 1320 

GCCCCAGGGG GTAACCGCTT GATGGAGACC ATCCCCATGA TCCCCAGTGC CAGCACCTCC 1380 

CGGGGGTCGT CCTCACGCCA GTCCAGCACA GCTGCCAGTT CCAGCTCCAG CGCCAACTCC 1440 

GGGGCCCGGG CGCCCCCAAC AAGGTCACAG ACCAACTGTT GCTTGCCGGA AGGGACCATG 1500 

AACAACGTGT ATGTCAACAT GCCCACGAAT TTCTCTGTGC TGAACTGTCA GCAGGCCACC 1560 

CAGATTGTGC CACATCAAGG GCAGTATCTG CATCCCCCAT ACGTGGGGTA CACGGTGCAG 1620 

CACGACTCTG TGCCCATGAC AGCTGTGCCA CCTTTCATGG ACGGCCTGCA GCCTGGCTAC 1680 

AGGCAGATTC AGTCCCCCTT CCCTCACACC AACAGTGAAC AGAAGATGTA CCCAGCGGTG 1740 

ACTGTATAAC CGAGAGTCAC TGGTGGGTTC CTTTACTGAA GGGAGACGAA GGCAGGGGTG 1800

GATTCTCGAG GTGGAAGTCC GCACATGTCG GTGGTATTTA TGGCACGATT CCTTTGGATG 1860 

GCTTCATTTG CCCCCAGACT GTATGAAAAC ATCTCCGAAT TAGCATTTCT GGATATGTTT 1920 

CATCCAGGGT ATCATTGATT TATGATGGAA AACCGGCCTC AGCTGGAGAT GACTGTGATG 1980 

TTGCTGATGG GTGTATAACA AATGCTTGAG TCCGAAGTGC CCTTGAGATA TGGTTGACGA 2040 

AAGAATTTTA TAAACTGATA AATTAAGGAT TTTTATTATG TTGTTATTAT TATTTCTTTT 2100

TTGTTGTTGA CTGCACAGGA TCAAAATGCC TGTTATCTCC CTTTTACTGG GACTTTTTTT 2160 

TTTTTTTTTT TTTTTTTTAA TCAGACAGGG TCTTGCTCTG TTGCCCAGGC TGGAGTGCAG 2220 

TGGTGCGATC TCGGCTCACT GCAACTTCAG CCTCCTGGAT TCAGGCAACA CTCCTGCCTC 2280 

AGCCTCCCAC GTGGCTGGGA TTACAGGTGC CTGCCCCCAT GGCTAATTTT TTGTATTTTT 2340 

TGTAGAGATG GGGTTTCACC ATGTTGGCTG GGCTGGTCTC ACTCTCCTGA CCTCAAGCAA 2400 

TCTGCCTGTC TCAGCCTCCC AAAGTGCTGG GATTACAGGC GTGAGCCACC GCCCCCAGCC 2460 

TGAGCCTTTT TTTTTTTCTA ATGCATCCAA GGTTAAGGGG AAGACGCAAA TAACAGGACT 2520 

ATTCTAAAAG GAAACCTGTT TGAACTCTGT GAGATCAGTC ATCAGTCTCA GTATTCCACA 2580 

GGCACACCTT AATTTCATTG TAAAAAGATA TATATATTTT GTCTATTTTT GTGCTTTTGG 2640 

GGGCCTATTT TGTGCTTTTT TACCTTATGT AGAGATCTTA TTACAAAGTG ATTTTCTACA 2700 

TTAAAAAGAG ACTGAAATAA ATTGTATAGT TACTTAACTA ATGAAGACAT TTCAGAACTC 2760 

TGGGATGATT TTAATCTTGA AGTAGTAGGT GGTATAGTCA TAAAACCATT CATCCCCTTC 2820 

TTGATTGTAT CTTAATTTTC TGGCTTTAAG GTGACATCTG AGAGGTAATG CATTCTTTTT 2880 

TATATTGAAA TCATAAACTA TCACCCGCTG CTTCTCTGAG TTACTTTTAA TTTTGCCTTG 2940 

TGGTTATGGT TTGGCGTTTC CTTCTGTTTG GTTTTCAGAG CCCCATGTCT ATATAGTCCT 3000 

GAGTGCAAGT AATTACTATA CTTGTAAATG AAGATCAGTA TTTCTGCCTA GATCTGATAA 3060 

AAAAATTTTC TTGTCTTAGT TATAAAAATT CAAAGAAATG TGTTACAAAG ATACTTAGTA 3120 

TAGCTCCTCA GCCATAACCT GAGACTTGGG ATGAAATTTA AACCAGATAC GATTTACTTT 3180 

GCAGATCATA AGGCTTTTTA TACTCTTGTT ATCAAAATGG CTTATTTTTC AGGCACTAAG 3240 

GATTGTTAAG AGAAAAGCTT TTCAACGAAG GATTGCCTTT CTTCTCCCAC ACTGTTCTTG 3300 

ATTTCCTCTC TCTTTCAGGC CTCAACAGGC ACTGTATTCA TTGCCAATGT TCCAAATTAT 3360 

CAAATTCAAG TGAATTTATT TGTGTGTTCT TTACTTATAT AAAAAAAGAT AACTTTAAGG 3420 

ATGTGCAAGT ACATTTCCAA CTGCTAGCAC AACCAGTATT TTGTAATTAA ACAAATCGCT 3480 

GTATGGTATG GTCTTCTACA CATTTATGTC TATAGATATC TATCGATCAT CTTTCTATTC 3540 

TGTTTCATGA CTGAATAATG TAAAACCAGT GTTGGCAATT GGTATCATCA ATGATACTCA 3600 

TTTTTTAATA ACCAAAGGCA GGGGAAAATC ATTTTACTTA TTAATAAATA TTTTATGATG 3660 
TGAAAAAAAA AAAAAAAAAA AAAAAAAAAA 



SEQ ID NO:195 BHB8 Protein sequence 

Protein Accession #: none found 

1 11 21 31 41 51 

| | | | | | 

MLSGFLMSPS TQHRAQYTPG GKKLPWEASI GAHTSRGRGS DRERESRFEA AGLLWDRAAA 60 

GEAEKGNRGE PPAWIRAQQQ PRPPPAGQAP GTAAGGAQDP RLRPGRSRGR VRLPVKPPEA 120 
SGRQPRGPSD CIPRFPSASA THKAVPKGTG PPAEDGDGLG APGPRARRRR LLGVAAEGSG 180 

PRGKRRGTVS DEARGSPGPR LLGDRPALSG DALSAPRWP CGALAARPSP HPGTPLRSCS 240 

CCWLRCWRRG RGPSGEYCHG WLDAQGVWRI GFQCPERFDG GDATICCGSC ALRYCCSSAE 300 

ARLDQGGCDN DRQQGAGEPG RADKDGPRRL GRASCLRGTQ GDGEGAPPPV RAWQRCSPEG 360 

SPKGRQLLRA FPGLLPRARR RGFPSSPRGG PSPLQRPALP IYVPFLIVGS VFVAFIILGS 420 
LVAACCCRCL RPKQDPQQSR APGGNRLMET IPMIPSASTS RGSSSRQSST AASSSSSANS 480 

GARAPPTRSQ TNCCLPEGTM NNVYVNMPTN FSVLNCQQAT QIVPHQGQYL HPPYVGYTVQ 540 

HDSVPMTAVP PFMDGLQPGY RQIQSPFPHT NSEQKMYPAV TV 

SEQ ID NO:196 CQA5 DNA SEQUENCE 

Nucleic Acid Accession #: AA083458 

Coding sequence: 862-1995 (underlined sequences correspond to start and stop codons) 

1 11 21 31 41 51 

| | | | | | 

GCCCTTGGAC ACTGACATGG ACTGAAGGAG TAGAATGGAG CACGAGGACA CTGACATGGA 60 

CTGAAGAAAA AGGAGCTGGA GCAGGAGAAG GAGGTGCTGC TGCAGGGTTT GGAGATGATG 120 

GCGCGGGGCC GCGACTGGTA CCAGCAGCAG CTGCAACGAG TGCAGGAGCG CCAGCGCCGC 180 

CTGGGCCAGA GCAGAGCCAG CGCCGACTTT GGGGCTGCAG GGAGCCCCCG CCCACTGGGG 240 

CGGCTACTGC CCAAGGTACA AGAGGTGGCC CGGTGCCTGG GGGAGCTGCT GGCTGCAGCC 300 

TGTGCCAGCC GGGCCCTGCC CCCGTCCTCC TCCGGGCCCC CCTGCCCTGC CCTGACGTCC 360 

ACCTCACCCC CGGTCTGGCA GCAGCAGACC ATCCTCATGC TGAAGGAGCA GAACCGACTC 420 

CTCACCCAGG AGGTGACCGA GAAGAGTGAG CGCATCACGC AGCTGGAGCA GGAGAAGTCG 480 

GCGCTCATTA AGCAGCTGTT TGAGGCCCGC GCCCTGAGCC AGCAGGACGG GGGACCTCTG 540 

GATTCCACCT TCATCTAGTC CTTGTGGGCC GCGTGGGCCC CCAGGGCCAG CCTGGCACTC 600 

AGCCCTTCGA GGGTGGGCGC CCCATCGCAC CCACCCTCTC TGGCTGGAGA CCCCCGGCAG 660 

GCCCAGGCAC AGTCCCGGAG TGGGCGCCTT CCTGCCGCCC TTGCCAGATG GGCTCCCCAG 720 

GCCTGCCCCC GGCTGGTCCC CGCACCGAGC GCTTGACTCC GTTTKGGCTC CTGGTTGYTG 780 

ACATGGGCTG GGGGCTCTCT TGAGTCCGCA TAGTCCGCAG CTACTACTGG CCGCTGTCAG 840 

TGGACAGTGG GGTACCCCTC CATGAGTTAG CGTCCCCCCG TTTCCAGCGG TGCCGCCCTG 900 

GGTCCCATCT TCAGGGAAAG GCACTGCCCA CGCCAGGCTG CACTTCCAAC AACGGGCAGC 960 

AGAGGGCGCG GGGCGGCTCC GACGCGGGTC CAAGGGCAGC TTCCCGCTCA ACCAGGGCAC 1020 

CAGGACGAGG TGGCTGTAGC TCGGACGGAC GGAAGTAGAT GGAGGGGGTG GGGACGGCCT 1080 

GTAAGCGGGG GGTGCCTGCC TGGCTGGGGA GCCCCAGGGA TAGCGGTCGG ACTTCAGGTT 1140 

CTGGCCAAGG CTGAGGGACC CTGGCTGCAG CGGATCGGCA CGCCGGGTGG GCGAGAGCTT 1200 

GGCCTGCATG TGCCTCCCAC AGACCCTGGG GTGATGGCCT TCCCCCTCTT GGCCGGGACG 1260 

TTGCCCCACG TTGAGTCCCA CACAACATCC TGTGAGCCTG GCTCCCCAGG AGGGCCCCCA 1320 

GACAGCTCCC AGGCACGTCA TAGGCAAAGC CTGTTTCCCC CGACTCAGGA TTTCCAAGGC 1380 

CTGGGGTCCT GCTCACCCCC CTTTGCTCTC ACGCCCAGCC TGTCCCCAGG TTTCAGCTGG 1440 

GAGAGGCCAC CTCOCTCAGC CAAGGAAAAC GAGAACCCCC AGGGTACAGG AGGAGGCTGG 1500 

GGCAGGTCCC CTTGGGTGTC ACTCCCTCAG CCCCTGCCCA GGCCCACTCC CGCTGGTGCT 1560 

GGAGTACGCA CTGGTGGGGG GGCCCTGCTC AGCCCAACCT GGAGGGTCCC AGTGTCACCA 1620 

GAACCAGGGG CACGGCAACA GCATCGATGG GTTCTGCAGC CCAGGGCCCC CGATGCGGGG 1680 

TCAGTGTGTG TGGGGCGCAG GGCCTCCGAT GCGGGGTCAG TGCGTGGGGG GCGCAGGGCC 1740 

CCCGATGCGG GGTCAGTGCG TGGGGGGCGC AGGGCCCCCT CGTGTCCAGG GCACTTTGGT 1800 

ACACTGTCCC ACAAGGCACC TGTCTCAGAG GAGGGGCCCT GGCAGGCAGC GTGGCAACTC 1860 

CCTTCCGGAG CCCAGCTCCA TGCTAACCTG CCCACAGCAA CCCCACAGAG CCACATTCCC 1920 

TGCTGCACCT GGTCTGCAGG GGTGTCCCAG GACAGGCCCA AGTCAGCCCA GCATGCAGCT 1980 

GCCCTCCTAC CCTGAAGATG GGAGTGGGCT TTCCAGGGGA CATAAGGATG TCAGGCCTGG 2040 

ACCTCCTGGG CAGGAAAGGG TGCAGGTCCT GAGGGCCTGT GCCCCACAGC CCCAGCACCC 2100 

AGGTGGACTG CAGCGCAGTG GGTGGGCCAG TGGCAGCCAG GGAGAAGCCC CCCGTCAGCA 2160 

GGCTGGGGTC TGCCCACCAG GGCCTCCCCA CGTCTGCCTT TGAGGGTGCC TGCCATGCCC 2220 

TGGGGGATCC TGGCATCTTT ACTGGACTGG AAGCAGGAGA CAGAACAGTG TCTGTCCCGG 2280 

GGTGACTTCA TCAGGAGACC GCCCACATAG AGCTGGACCC CGCAGCTGAA GCGGAAATGT 2340 

GAGACAGGCT GGCACCTCCG GAAAAACTGC CTTTCAGCCT TGGTGTTCCG TGCAAGGTGA 2400 

AAAGAAATAG GTCCTCCCAG TTTACAGCTT GAAATCAGGC TAGTGAGTGG CCCTGGAGAC 2460 

CACGAGGGGA GAATTTAAAG GCCCCGGCTG GCAGGGTCTA GGTGGCTGGC AGAGGCACAT 2520 

GCAGACCCTG CCTGGAGCCT GCCCTAGGAC GCTGGGCGGG TCAGTCTCCG TGCAGGATGT 2580 

GAGCAGCGTC CCTGGGCTCT ATCCGCGAGG TGCCAGTAGC GTGTGCAGGT ACATACACGT 2640 

GCGTGCACAC TGTGATGACA CCCGGAAATG TCTCAGGATG TTGAAATGTG TCCTTGGGGG 2700 

CAGAAGTGTC CCCAGTTGAG AATCTGCCCC AGAGGAACAC ACCCACACCA GGCCTCAGGA 2760 

TTTTGTGTTG ATCAAGTTCC AAGGAAAAGG AACATCTCAG CCGGGCGTGG TGGTTCACGC 2820 

CTGGAATCCC AGCACTTGAG GCCAGGAGTT CCAGAGCAGC CTGGGCAACG CAGTGAGAGA 2880 

CCCCATCTCT ACAARAAAAA AAAAAGAAAG AAAGAAAATG AGAGATCCAG GTTTAAAAAT 2940 

TCATAAACAC CACAAGGAAA CAATACACTA TGAGACCCAG CAGAAGCAAC AGATTGACTC 3000 

TAGACCCAGA TACTAGAATT ATCAGAGAGA ATATAAAGTA ACAGTGTTTT ATATATCTAA 3060 
AGAAATAAAA GAGATTTCTG GAAACATGAA AAAAAA 
SEQ ID NO:197 LBG2 DNA SEQUENCE 

Nucleic Acid Accession #: X63629 

Coding sequence: 54-2543 (start and stop codons are underlined) 

1 11 21 31 41 51 

| | | | | | 

GCGGAACACC GGCCCGCCGT CGCGGCAGCT GCTTCACCCC TCTCTCTGCA GCCATGGGGC 60 
TCCCTCGTGG ACCTCTCGCG TCTCTCCTCC TTCTCCAGGT TTGCTGGCTG CAGTGCGCGG 120 
CCTCCGAGCC GTGCCGGGCG GTCTTCAGGG AGGCTGAAGT GACCTTGGAG GCGGGAGGCG 180 
CGGAGCAGGA GCCCGGCCAG GCGCTGGGGA AAGTATTCAT GGGCTGCCCT GGGCAAGAGC 240 
CAGCTCTGTT TAGCACTGAT AATGATGACT TCACTGTGCG GAATGGCGAG ACAGTCCAGG 300 
AAAGAAGGTC ACTGAAGGAA AGGAATCCAT TGAAGATCTT CCCATCCAAA CGTATCTTAC 360 
GAAGACACAA GAGAGATTGG GTGGTTGCTC CAATATCTGT CCCTGAAAAT GGCAAGGGTC 420 
CCTTCCCCCA GAGACTGAAT CAGCTCAAGT CTAATAAAGA TAGAGACACC AAGATTTTCT 480 
ACAGCATCAC GGGGCCGGGG GCAGACAGCC CCCCTGAGGG TGTCTTCGCT GTAGAGAAGG 540 
AGACAGGCTG GTTGTTGTTG AATAAGCCAC TGGACCGGGA GGAGATTGCC AAGTATGAGC 600 
TCTTTGGCCA CGCTGTGTCA GAGAATGGTG CCTCAGTGGA GGACCCCATG AACATCTCCA 660 
TCATCGTGAC CGACCAGAAT GACCACAAGC CCAAGTTTAC CCAGGACACC TTCCGAGGGA 720 
GTGTCTTAGA GGGAGTCCTA CCAGGTACTT CTGTGATGCA GGTGACAGCC ACAGATGAGG 780 
ATGATGCCAT CTACACCTAC AATGGGGTGG TTGCTTACTC CATCCATAGC CAAGAACCAA 840 
AGGACCCACA CGACCTCATG TTCACAATTC ACCGGAGCAC AGGCACCATC AGCGTCATCT 900 
CCAGTGGCCT GGACCGGGAA AAAGTCCCTG AGTACACACT GACCATCCAG GCCACAGACA 960 
TGGATGGGGA CGGCTCCACC ACCACGGCAG TGGCAGTAGT GGAGATCCTT GATGCCAATG 1020 
ACAATGCTCC CATGTTTGAC CCCCAGAAGT ACGAGGCCCA TGTGCCTGAG AATGCAGTGG 1080 
GCCATGAGGT GCAGAGGCTG ACGGTCACTG ATCTGGACGC CCCCAACTCA CCAGCGTGGC 1140 
GTGCCACCTA CCTTATCATG GGCGGTGACG ACGGGGACCA TTTTACCATC ACCACCCACC 1200 
CTGAGAGCAA CCAGGGCATC CTGACAACCA GGAAGGGTTT GGATTTTGAG GCCAAAAACC 1260 
AGCACACCCT GTACGTTGAA GTGACCAACG AGGCCCCTTT TGTGCTGAAG CTCCCAACCT 1320 
CCACAGCCAC CATAGTGGTC CACGTGGAGG ATGTGAATGA GGCACCTGTG TTTGTCCCAC 1380 
CCTCCAAAGT CGTTGAGGTC CAGGAGGGCA TCCCCACTGG GGAGCCTGTG TGTGTCTACA 1440 
CTGCAGAAGA CCCTGACAAG GAGAATCAAA AGATCAGCTA CCGCATCCTG AGAGACCCAG 1500 
CAGGGTGGCT AGCCATGGAC CCAGACAGTG GGCAGGTCAC AGCTGTGGGC ACCCTCGACC 1560 
GTGAGGATGA GCAGTTTGTG AGGAACAACA TCTATGAAGT CATGGTCTTG GCCATGGACA 1620 
ATGGAAGCCC TCCCACCACT GGCACGGGAA CCCTTCTGCT AACACTGATT GATGTCAACG 1680 
ACCATGGCCC AGTCCCTGAG CCCCGTCAGA TCACCATCTG CAACCAAAGC CCTGTGCGCC 1740 
ACGTGCTGAA CATCACGGAC AAGGACCTGT CTCCCCACAC CTCCCCTTTC CAGGCCCAGC 1800 
TCACAGATGA CTCAGACATC TACTGGACGG CAGAGGTCAA CGAGGAAGGT GACACAGTGG 1860 
TCTTGTCCCT GAAGAAGTTC CTGAAGCAGG ATACATATGA CGTGCACCTT TCTCTGTCTG 1920 
ACCATGGCAA CAAAGAGCAG CTGACGGTGA TCAGGGCCAC TGTGTGCGAC TGCCATGGCC 1980 
ATGTCGAAAC CTGCCCTGGA CCCTGGAAAG GAGGTTTCAT CCTCCCTGTG CTGGGGGCTG 2040 
TCCTGGCTCT GCTGTTCCTC CTGCTGGTGC TGCTTTTGTT GGTGAGAAAG AAGCGGAAGA 2100 
TCAAGGAGCC CCTCCTACTC CCAGAAGATG ACACCCGTGA CAACGTCTTC TACTATGGCG 2160 
AAGAGGGGGG TGGCGAAGAG GACCAGGACT ATGACATCAC CCAGCTCCAC CGAGGTCTGG 2220 
AGGCCAGGCC GGAGGTGGTT CTCCGCAATG ACGTGGCACC AACCATCATC CCGACACCCA 2280 
TGTACCGTCC TAGGCCAGCC AACCCAGATG AAATCGGCAA CTTTATAATT GAGAACCTGA 2340 
AGGCGGCTAA CACAGACCCC ACAGCCCCGC CCTACGACAC CCTCTTGGTG TTCGACTATG 2400 
AGGGCAGCGG CTCCGACGCC GCGTCCCTGA GCTCCCTCAC CTCCTCCGCC TCCGACCAAG 2460 
ACCAAGATTA CGATTATCTG AACGAGTGGG GCAGCCGCTT CAAGAAGCTG GCAGACATGT 2520 
ACGGTGGCGG GGAGGACGAC TAGGCGGCCT GCCTGCAGGG CTGGGGACCA AACGTCAGGC 2580 
CACAGAGCAT CTCCAAGGGG TCTCAGTTCC CCCTTCAGCT GAGGACTTCG GAGCTTGTCA 2640 
GGAAGTGGCC GTAGCAACTT GGCGGAGACA GGCTATGAGT CTGACGTTAG AGTGGTTGCT 2700 
TCCTTAGCCT TTCAGGATGG AGGAATGTGG GCAGTTTGAC TTCAGCACTG AAAACCTCTC 2760 
CACCTGGGCC AGGGTTGCCT CAGAGGCCAA GTTTCCAGAA GCCTCTTACC TGCCGTAAAA 2820 
TGCTCAACCC TGTGTCCTGG GCCTGGGCCT GCTGTGACTG ACCTACAGTG GACTTTCTCT 2880 
CTGGAATGGA ACCTTCTTAG GCCTCCTGGT GCAACTTAAT TTTTTTTTTT AATGCTATCT 2940 
TCAAAACGTT AGAGAAAGTT CTTCAAAAGT GCAGCCCAGA GCTGCTGGGC CCACTGGCCG 3000 
TCCTGCATTT CTGGTTTCCA GACCCCAATG CCTCCCATTC GGATGGATCT CTGCGTTTTT 3060 
ATACTGAGTG TGCCTAGGTT GCCCCTTATT TTTTATTTTC CCTGTTGCGT TGCTATAGAT 3120 
GAAGGGTGAG GACAATCGTG TATATGTACT AGAACTTTTT TATTAAAGAA A 
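
The coding-sequence coordinates quoted for each entry appear to be 1-based and inclusive (for LBG2, position 54 is the A of the ATG start codon). The following is a minimal sketch, not part of the original listing, of how a listing block could be joined into one string and sliced by those coordinates. The function names `clean_rows` and `extract_cds` are hypothetical helpers introduced only for illustration.

```python
import re


def clean_rows(listing_text: str) -> str:
    """Join listing rows into one uppercase string.

    Spaces, line breaks, position counters and any other non-letter
    characters are dropped; only the sequence letters are kept.
    """
    return re.sub(r"[^A-Za-z]", "", listing_text).upper()


def extract_cds(sequence: str, start: int, end: int) -> str:
    """Return the region for 1-based, inclusive coordinates (assumed convention)."""
    if not (1 <= start <= end <= len(sequence)):
        raise ValueError("coordinates fall outside the sequence")
    return sequence[start - 1:end]


# First two rows of the LBG2 DNA entry (coding sequence stated as 54-2543).
rows = """
GCGGAACACC GGCCCGCCGT CGCGGCAGCT GCTTCACCCC TCTCTCTGCA GCCATGGGGC 60
TCCCTCGTGG ACCTCTCGCG TCTCTCCTCC TTCTCCAGGT TTGCTGGCTG CAGTGCGCGG 120
"""
seq = clean_rows(rows)
start_codon = extract_cds(seq, 54, 56)
print(len(seq), start_codon)  # 120 ATG
```

Because every non-letter character is discarded, the same helper also tolerates rows in which extraction has split a ten-base group or a position counter with stray spaces.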






SEQ ID NO:198 LBG2 Protein sequence: 

Protein Accession #: CAA45177 



1 11 21 31 41 51 

| | | | | | 

MGLPRGPLAS LLLLQVCWLQ CAASEPCRAV FREAEVTLEA GGAEQEPGQA LGKVFMGCPG 60 
QEPALFSTDN DDFTVRNGET VQERRSLKER NPLKIFPSKR ILRRHKRDWV VAPISVPENG 120 
KGPFPQRLNQ LKSNKDRDTK IFYSITGPGA DSPPEGVFAV EKETGWLLLN KPLDREEIAK 180 
YELFGHAVSE NGASVEDPMN ISIIVTDQND HKPKFTQDTF RGSVLEGVLP GTSVMQVTAT 240 
DEDDAIYTYN GVVAYSIHSQ EPKDPHDLMF TIHRSTGTIS VISSGLDREK VPEYTLTIQA 300 
TDMDGDGSTT TAVAVVEILD ANDNAPMFDP QKYEAHVPEN AVGHEVQRLT VTDLDAPNSP 360 
AWRATYLIMG GDDGDHFTIT THPESNQGIL TTRKGLDFEA KNQHTLYVEV TNEAPFVLKL 420 
PTSTATIVVH VEDVNEAPVF VPPSKVVEVQ EGIPTGEPVC VYTAEDPDKE NQKISYRILR 480 
DPAGWLAMDP DSGQVTAVGT LDREDEQFVR NNIYEVMVLA MDNGSPPTTG TGTLLLTLID 540 
VNDHGPVPEP RQITICNQSP VRHVLNITDK DLSPHTSPFQ AQLTDDSDIY WTAEVNEEGD 600 
TVVLSLKKFL KQDTYDVHLS LSDHGNKEQL TVIRATVCDC HGHVETCPGP WKGGFILPVL 660 
GAVLALLFLL LVLLLLVRKK RKIKEPLLLP EDDTRDNVFY YGEEGGGEED QDYDITQLHR 720 
GLEARPEVVL RNDVAPTIIP TPMYRPRPAN PDEIGNFIIE NLKAANTDPT APPYDTLLVF 780 
DYEGSGSDAA SLSSLTSSAS DQDQDYDYLN EWGSRFKKLA DMYGGGEDD 

SEQ ID NO:199 OB15 DNA SEQUENCE 

Nucleic Acid Accession #: NM_012152 

Coding sequence: 43-1104 (underlined sequences correspond to start and stop codons) 

1 11 21 31 41 51 

| | | | | | 

CTTCTTTAAA TTTCTTTCTA GGATGTTCAC TTCTTCTCCA CAATGAATGA GTGTCACTAT 60 

GACAAGCACA TGGACTTTTT TTATAATAGG AGCAACACTG ATACTGTCGA TGACTGGACA 120 

GGAACAAAGC TTGTGATTGT TTTGTGTGTT GGGACGTTTT TCTGCCTGTT TATTTTTTTT 180 

TCTAATTCTC TGGTCATCGC GGCAGTGATC AAAAACAGAA AATTTCATTT CCCCTTCTAC 240 

TACCTGTTGG CTAATTTAGC TGCTGCCGAT TTCTTCGCTG GAATTGCCTA TGTATTCCTG 300 

ATGTTTAACA CAGGCCCAGT TTCAAAAACT TTGACTGTCA ACCGCTGGTT TCTCCGTCAG 360 

GGGCTTCTGG ACAGTAGCTT GACTGCTTCC CTCACCAACT TGCTGGTTAT CGCCGTGGAG 420 

AGGCACATGT CAATCATGAG GATGCGGGTC CATAGCAACC TGACCAAAAA GAGGGTGACA 480 

CTGCTCATTT TGCTTGTCTG GGCCATCGCC ATTTTTATGG GGGCGGTCCC CACACTGGGC 540 

TGGAATTGCC TCTGCAACAT CTCTGCCTGC TCTTCCCTGG CCCCCATTTA CAGCAGGAGT 600 

TACCTTGTTT TCTGGACAGT GTCCAACCTC ATGGCCTTCC TCATCATGGT TGTGGTGTAC 660 

CTGCGGATCT ACGTGTACGT CAAGAGGAAA ACCAACGTCT TGTCTCCGCA TACAAGTGGG 720 

TCCATCAGCC GCCGGAGGAC ACCCATGAAG CTAATGAAGA CGGTGATGAC TGTCTTAGGG 780 

GCGTTTGTGG TATGCTGGAC CCCGGGCCTG GTGGTTCTGC TCCTCGACGG CCTGAACTGC 840 

AGGCAGTGTG GCGTGCAGCA TGTGAAAAGG TGGTTCCTGC TGCTGGCGCT GCTCAACTCC 900 

GTCGTGAACC CCATCATCTA CTCCTACAAG GACGAGGACA TGTATGGCAC CATGAAGAAG 960 

ATGATCTGCT GCTTCTCTCA GGAGAACCCA GAGAGGCGTC CCTCTCGCAT CCCCTCCACA 1020 

GTCCTCAGCA GGAGTGACAC AGGCAGCCAG TACATAGAGG ATAGTATTAG CCAAGGTGCA 1080 

GTCTGCAATA AAAGCACTTC CTAAACTCTG GATGCCTCTC GGCCCACCCA GGTGATGACT 1140 
GTCTTAGG 



SEQ ID NO:200 OB15 Protein sequence: 

Protein Accession #: NP_036284 

1 11 21 31 41 51 

| | | | | | 

MNECHYDKHM DFFYNRSNTD TVDDWTGTKL VIVLCVGTFF CLFIFFSNSL VIAAVIKNRK 60 
FHFPFYYLLA NLAAADFFAG IAYVFLMFNT GPVSKTLTVN RWFLRQGLLD SSLTASLTNL 120 
LVIAVERHMS IMRMRVHSNL TKKRVTLLIL LVWAIAIFMG AVPTLGWNCL CNISACSSLA 180 
PIYSRSYLVF WTVSNLMAFL IMVVVYLRIY VYVKRKTNVL SPHTSGSISR RRTPMKLMKT 240 
VMTVLGAFVV CWTPGLVVLL LDGLNCRQCG VQHVKRWFLL LALLNSVVNP IIYSYKDEDM 300 
YGTMKKMICC FSQENPERRP SRIPSTVLSR SDTGSQYIED SISQGAVCNK STS 

SEQ ID NO:201 PAA6 DNA SEQUENCE 

Nucleic Acid Accession #: AA569531 

Coding sequence: 1-504 (underlined sequences correspond to start and stop codons) 

1 11 21 31 41 51 

| | | | | | 

ATGACCTACA GTTACTCATT TTTCAGGCCT GAGTTGATCG TTAATCATCT TAATTATGTT 60 

CATTCTGAAG CCAACAGGAG AACCAAGACC AAAACTTTAT TGTCTCTGCT TTCATTTCTT 120 

GATGAAACCT CTGGACTAAG CACACATCTT CCTTGTTTAT CTCTCTCAAA GGAGTGTGGA 180 

GTGCTTCATC TGGACATCCA CGGGAAGAAG GAAGACATGA GAATCACCCA ACAGTCTTCC 240 

CAGCTATACC TGTGGGACAT GGGTGGTTTT ACAATATTTA AGAACCTGTG GATGAGCCTC 300 

ATACCCAGAG GGAACAAACG CTCCCCAAAA AGAGTTACAG AAACCATCCT GAGAGATTTT 360 

AAGCAGAAGC AAAGTTCAAA GATCCAAGAG GAGAGACGAA GAGAGTCTGC AGGACCAAAC 420 

CTCTCTTCAT TCTGGTTTGT GGGGAATGCT GGAAGAGGAG ACAGGCCCCA GATTTGGGCA 480 

GGAAGTAAAC AGTTTTCAGG CTGAGGCCAA TCTGAGCAGG AACATTCCAA TATTTCTTCA 540 

GCTACGTTGT CCCAGCACTT CACTGGTTAA CCTTTTATGT CCACCATTTG TGGATTTCAC 600 

AGCTACTTGT CAATGGTGAA TATTGATCAT CATCATTATC TACTGAGCTG CTACCATATC 660 

CCAGCTACTC CTTGCATGTT GTTCATTATT TTCTCAACAC TCAGCATATT TGCAATATGT 720 

TATGTAATAT CACAGACAAG GAAACTGAAC GCAGAAATGT TTTATTTCTT GCCAAACATC 780 

ACATGAGGAT GAACAATGAA ACCGATTTGA AACCAGGATT GTCTGATTCC AACATCTCTG 840 
GGTCCTTTTT CACTCTGATA TGCTGCAATT AAAAAGCCAT TTCTAAGACT GT 



SEQ ID NO:202 PAA6 Protein sequence: 
Protein Accession #: none found 

1 11 21 31 41 51 

| | | | | | 

MTYSYSFFRP ELIVNHLNYV HSEANRRTKT KTLLSLLSFL DETSGLSTHL PCLSLSKECG 60 
VLHLDIHGKK EDMRITQQSS QLYLWDMGGF TIFKNLWMSL IPRGNKRSPK RVTETILRDF 120 
KQKQSSKIQE ERRRESAGPN LSSFWFVGNA GRGDRPQIWA GSKQFSG 
SEQ ID NO:203 PAB2 DNA SEQUENCE 

Nucleic Acid Accession #: XM_050197 

Coding sequence: 310-1971 (underlined sequences correspond to start and stop codons) 

1 11 21 31 41 51 

| | | | | | 

TCACACGTGC CAAGGGGCTG GCTCAGCGGA ACCAGCCTGC ACGCGCTGGC TCCGGGTGAC 60 
AGCCGCGCGC CTCGGCCAGG ATCTGAGTGA TGAGACGTGT CCCCACTGAG GTGCCCCACA 120 
GCAGCAGGTG TTGAGCATGG GCTGAGAAGC TGGACCGGCA CCAAAGGGCT GGCAGAAATG 180 
GGCGCCTGGC TGATTCCTAG GCAGTTGGCG GCAGCAAGGA GGAGAGGCCG CAGCTTCTGG 240 
AGCAGAGCCG AGACGAAGCA GTTCTGGAGT GCCTGAACGG CCCCCTGAGC CCTACCCGCC 300 
TGGCCCACTA TGGTCCAGAG GCTGTGGGTG AGCCGCCTGC TGCGGCACCG GAAAGCCCAG 360 
CTCTTGCTGG TCAACCTGCT AACCTTTGGC CTGGAGGTGT GTTTGGCCGC AGGCATCACC 420 
TATGTGCCGC CTCTGCTGCT GGAAGTGGGG GTAGAGGAGA AGTTCATGAC CATGGTGCTG 480 
GGCATTGGTC CAGTGCTGGG CCTGGTCTGT GTCCCGCTCC TAGGCTCAGC CAGTGACCAC 540 
TGGCGTGGAC GCTATGGCCG CCGCCGGCCC TTCATCTGGG CACTGTCCTT GGGCATCCTG 600 
CTGAGCCTCT TTCTCATCCC AAGGGCCGGC TGGCTAGCAG GGCTGCTGTG CCCGGATCCC 660 
AGGCCCCTGG AGCTGGCACT GCTCATCCTG GGCGTGGGGC TGCTGGACTT CTGTGGCCAG 720 
GTGTGCTTCA CTCCACTGGA GGCCCTGCTC TCTGACCTCT TCCGGGACCC GGACCACTGT 780 
CGCCAGGCCT ACTCTGTCTA TGCCTTCATG ATCAGTCTTG GGGGCTGCCT GGGCTACCTC 840 
CTGCCTGCCA TTGACTGGGA CACCAGTGCC CTGGCCCCCT ACCTGGGCAC CCAGGAGGAG 900 
TGCCTCTTTG GCCTGCTCAC CCTCATCTTC CTCACCTGCG TAGCAGCCAC ACTGCTGGTG 960 
GCTGAGGAGG CAGCGCTGGG CCCCACCGAG CCAGCAGAAG GGCTGTCGGC CCCCTCCTTG 1020 
TCGCCCCACT GCTGTCCATG CCGGGCCCGC TTGGCTTTCC GGAACCTGGG CGCCCTGCTT 1080 
CCCCGGCTGC ACCAGCTGTG CTGCCGCATG CCCCGCACCC TGCGCCGGCT CTTCGTGGCT 1140 
GAGCTGTGCA GCTGGATGGC ACTCATGACC TTCACGCTGT TTTACACGGA TTTCGTGGGC 1200 
GAGGGGCTGT ACCAGGGCGT GCCCAGAGCT GAGCCGGGCA CCGAGGCCCG GAGACACTAT 1260 
GATGAAGGCG TTCGGATGGG CAGCCTGGGG CTGTTCCTGC AGTGCGCCAT CTCCCTGGTC 1320 
TTCTCTCTGG TCATGGACCG GCTGGTGCAG CGATTCGGCA CTCGAGCAGT CTATTTGGCC 1380 
AGTGTGGCAG CTTTCCCTGT GGCTGCCGGT GCCACATGCC TGTCCCACAG TGTGGCCGTG 1440 
GTGACAGCTT CAGCCGCCCT CACCGGGTTC ACCTTCTCAG CCCTGCAGAT CCTGCCCTAC 1500 
ACACTGGCCT CCCTCTACCA CCGGGAGAAG CAGGTGTTCC TGCCCAAATA CCGAGGGGAC 1560 
ACTGGAGGTG CTAGCAGTGA GGACAGCCTG ATGACCAGCT TCCTGCCAGG CCCTAAGCCT 1620 
GGAGCTCCCT TCCCTAATGG ACACGTGGGT GCTGGAGGCA GTGGCCTGCT CCCACCTCCA 1680 
CCCGCGCTCT GCGGGGCCTC TGCCTGTGAT GTCTCCGTAC GTGTGGTGGT GGGTGAGCCC 1740 
ACCGAGGCCA GGGTGGTTCC GGGCCGGGGC ATCTGCCTGG ACCTCGCCAT CCTGGATAGT 1800 
GCCTTCCTGC TGTCCCAGGT GGCCCCATCC CTGTTTATGG GCTCCATTGT CCAGCTCAGC 1860 
CAGTCTGTCA CTGCCTATAT GGTGTCTGCC GCAGGCCTGG GTCTGGTCGC CATTTACTTT 1920 
GCTACACAGG TAGTATTTGA CAAGAGCGAC TTGGCCAAAT ACTCAGCGTA GAAAACTTCC 1980 
AGCACATTGG GGTGGAGGGC CTGCCTCACT GGGTCCCAGC TCCCCGCTCC TGTTAGCCCC 2040 
ATGGGGCTGC CGGGCTGGCC GCCAGTTTCT GTTGCTGCCA AAGTAATGTG GCTCTCTGCT 2100 
GCCACCCTGT GCTGCTGAGG TGCGTAGCTG CACAGCTGGG GGCTGGGGCG TCCCTCTCCT 2160 
CTCTCCCCAG TCTCTAGGGC TGCCTGACTG GAGGCCTTCC AAGGGGGTTT CAGTCTGGAC 2220 
TTATACAGGG AGGCCAGAAG GGCTCCATGC ACTGGAATGC GGGGACTCTG CAGGTGGATT 2280 
ACCCAGGCTC AGGGTTAACA GCTAGCCTCC TAGTTGAGAC ACACCTAGAG AAGGGTTTTT 2340 
GGGAGCTGAA TAAACTCAGT CACCTGGTTT CCCATCTCTA AGCCCCTTAA CCTGCAGCTT 2400 
CGTTTAATGT AGCTCTTGCA TGGGAGTTTC TAGGATGAAA CACTCCTCCA TGGGATTTGA 2460 
ACATATGAAA GTTATTTGTA GGGGAAGAGT CCTGAGGGGC AACACACAAG AACCAGGTCC 2520 
CCTCAGCCCC ACAGGCACTG GTCTTTTTTG CTNGANTCCA CCCCCCCCCT CTTTACCCTT 2580 
TT 



SEQ ID NO:204 PAB2 Protein sequence: 
Protein Accession #: XP_050197 

1 11 21 31 41 51 

| | | | | | 

MVQRLWVSRL LRHRKAQLLL VNLLTFGLEV CLAAGITYVP PLLLEVGVEE KFMTMVLGIG 60 
PVLGLVCVPL LGSASDHWRG RYGRRRPFIW ALSLGILLSL FLIPRAGWLA GLLCPDPRPL 120 
ELALLILGVG LLDFCGQVCF TPLEALLSDL FRDPDHCRQA YSVYAFMISL GGCLGYLLPA 180 
IDWDTSALAP YLGTQEECLF GLLTLIFLTC VAATLLVAEE AALGPTEPAE GLSAPSLSPH 240 
CCPCRARLAF RNLGALLPRL HQLCCRMPRT LRRLFVAELC SWMALMTFTL FYTDFVGEGL 300 
YQGVPRAEPG TEARRHYDEG VRMGSLGLFL QCAISLVFSL VMDRLVQRFG TRAVYLASVA 360 
AFPVAAGATC LSHSVAVVTA SAALTGFTFS ALQILPYTLA SLYHREKQVF LPKYRGDTGG 420 
ASSEDSLMTS FLPGPKPGAP FPNGHVGAGG SGLLPPPPAL CGASACDVSV RVVVGEPTEA 480 
RVVPGRGICL DLAILDSAFL LSQVAPSLFM GSIVQLSQSV TAYMVSAAGL GLVAIYFATQ 540 
VVFDKSDLAK YSA 



SEQ ID NO:205 PAJ3 DNA SEQUENCE 

Nucleic Acid Accession #: AK002126 

Coding sequence: 1-1593 (underlined sequences correspond to start and stop codons) 

1 11 21 31 41 51 

| | | | | | 



ATGGTTCGCC GGGGGCTGCT TGCGTGGATT TCCCGGGTGG TGGTTTTGCT GGTGCTCCTC 60 

TGCTGTGCTA TCTCTGTCCT GTACATGTTG GCCTGCACCC CAAAAGGTGA CGAGGAGCAG 120 

CTGGCACTGC CCAGGGCCAA CAGCCCCACG GGGAAGGAGG GGTACCAGGC CGTCCTTCAG 180 

GAGTGGGAGG AGCAGCACCG CAACTACGTG AGCAGCCTGA AGCGGCAGAT CGCACAGCTC 240 
AAGGAGGAGC TGCAGGAGAC GAGTGAGCAG CTCAGGAATG GGCAGTACCA AGCCAGCGAT 300 

GCTGCTGGCC TGGGTCTGGA CAGGAGCCCC CCAGAGAAAA CCCAGGCCGA CCTCCTGGCC 360 

TTCCTGCACT CGCAGGTGGA CAAGGCAGAG GTGAATGCTG GCGTCAAGCT GGCCACAGAG 420 

TATGCAGCAG TGCCTTTCGA TAGCTTTACT CTACAGAAGG TGTACCAGCT GGAGACTGGC 480 

CTTACCCGCC ACCCCGAGGA GAAGCCTGTG AGGAAGGACA AGCGGGATGA GTTGGTGGAA 540 

GCCATTGAAT CAGCCTTGGA GACCCTGAAC AATCCTGCAG AGAACAGCCC CAATCACCGT 600 

CCTTACACGG CCTCTGATTT CATAGAAGGG ATCTACCGAA CAGAAAGGGA CAAAGGGACA 660 

TTGTATGAGC TCACCTTCAA AGGGGACCAC AAACACGAAT TCAAACGGCT CATCTTATTT 720 

CGACCATTCG GCCCCATCAT GAAAGTGAAA AATGAAAAGC TCAACATGGC CAACACGCTT 780 

ATCAATGTTA TCGTGCCTCT AGCAAAAAGG GTGGACAAGT TCCGGCAGTT CATGCAGAAT 840 

TTCAGGGAGA TGTGCATTGA GCAGGATGGG AGAGTCCATC TCACTGTTGT TTACTTTGGG 900 

AAAGAAGAAA TAAATGAAGT CAAAGGAATA CTTGAAAACA CTTCCAAAGC TGCCAACTTC 960 

AGGAACTTTA CCTTCATCCA GCTGAATGGA GAATTTTCTC GGGGAAAGGG ACTTGATGTT 1020 

GGAGCCCGCT TCTGGAAGGG AAGCAACGTC CTTCTCTTTT TCTGTGATGT GGACATCTAC 1080 

TTCACATCTG AATTCCTCAA TACGTGTAGG CTGAATACAC AGCCAGGGAA GAAGGTATTT 1140 

TATCCAGTTC TTTTCAGTCA GTACAATCCT GGCATAATAT ACGGCCACCA TGATGCAGTC 1200 

CCTCCCTTGG AACAGCAGCT GGTCATAAAG AAGGAAACTG GATTTTGGAG AGACTTTGGA 1260 

TTTGGGATGA CGTGTCAGTA TCGGTCAGAC TTCATCAATA TAGGTGGGTT TGATCTGGAC 1320 

ATCAAAGGCT GGGGCGGAGA GGATGTGCAC CTTTATCGCA AGTATCTCCA CAGCAACCTC 1380 

ATAGTGGTAC GGACGCCTGT GCGAGGACTC TTCCACCTCT GGCATGAGAA GCGCTGCATG 1440 

GACGAGCTGA CCCCCGAGCA GTACAAGATG TGCATGCAGT CCAAGGCCAT GAACGAGGCA 1500 

TCCCACGGCC AGCTGGGCAT GCTGGTGTTC AGGCACGAGA TAGAGGCTCA CCTTCGCAAA 1560 
CAGAAACAGA AGACAAGTAG CAAAAAAACA TGA 
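
Each protein entry in this listing is presented alongside the DNA entry that encodes it, so a translated coding sequence can serve as a cross-check on a protein row. The following is a minimal sketch, not part of the original listing, that translates a coding sequence with the standard genetic code. The `translate` function and the `CODON_TABLE` name are hypothetical, and the sketch assumes the standard code applies and that the reading frame starts at the first base supplied.

```python
from itertools import product

# Standard genetic code, written in the conventional T, C, A, G base order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): residue
    for codon, residue in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(cds: str) -> str:
    """Translate a coding sequence, stopping at the first stop codon.

    Codons containing anything other than A, C, G or T (for example an
    ambiguity code such as N) are reported as X.
    """
    cds = cds.upper()
    residues = []
    for i in range(0, len(cds) - len(cds) % 3, 3):
        residue = CODON_TABLE.get(cds[i:i + 3], "X")
        if residue == "*":
            break
        residues.append(residue)
    return "".join(residues)


# First row (positions 1-60) of the PAJ3 DNA entry, coding sequence 1-1593.
first_row = (
    "ATGGTTCGCC GGGGGCTGCT TGCGTGGATT "
    "TCCCGGGTGG TGGTTTTGCT GGTGCTCCTC"
).replace(" ", "")
print(translate(first_row))  # MVRRGLLAWISRVVVLLVLL
```

Comparing this output with the first two ten-residue groups of the PAJ3 protein entry is one way to spot characters that were misread during text extraction.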






SEQ ID NO:206 PAJ3 Protein sequence: 
Protein Accession #: NP_060841 

1 11 21 31 41 51 

| | | | | | 

MVRRGLLAWI SRVVVLLVLL CCAISVLYML ACTPKGDEEQ LALPRANSPT GKEGYQAVLQ 60 
EWEEQHRNYV SSLKRQIAQL KEELQERSEQ LRNGQYQASD AAGLGLDRSP PEKTQADLLA 120 
FLHSQVDKAE VNAGVKLATE YAAVPFDSFT LQKVYQLETG LTRHPEEKPV RKDKRDELVE 180 
AIESALETLN NPAENSPNHR PYTASDFIEG IYRTERDKGT LYELTFKGDH KHEFKRLILF 240 
RPFGPIMKVK NEKLNMANTL INVIVPLAKR VDKFRQFMQN FREMCIEQDG RVHLTVVYFG 300 
KEEINEVKGI LENTSKAANF RNFTFIQLNG EFSRGKGLDV GARFWKGSNV LLFFCDVDIY 360 
FTSEFLNTCR LNTQPGKKVF YPVLFSQYNP GIIYGHHDAV PPLEQQLVIK KETGFWRDFG 420 
FGMTCQYRSD FINIGGFDLD IKGWGGEDVH LYRKYLHSNL IVVRTPVRGL FHLWHEKRCM 480 
DELTPEQYKM CMQSKAMNEA SHGQLGMLVF RHEIEAHLRK QKQKTSSKKT 



SEQ ID NO:207 PAJ5 DNA SEQUENCE 

Nucleic Acid Accession #: AF189723 

Coding sequence: 1-2712 (underlined sequences correspond to start and stop codons) 

1 11 21 31 41 51 

| | | | | | 

ATGATTCCTG TATTGACATC AAAAAAAGCA AGTGAATTAC CAGTCAGTGA AGTTGCAAGC 60 

ATTCTCCAAG CTGATCTTCA GAATGGTCTA AACAAATGTG AAGTTAGTCA TAGGCGAGCC 120 

TTTCATGGCT GGAATGAGTT TGATATTAGT GAAGATGAGC CACTGTGGAA GAAGTATATT 180 

TCTCAGTTTA AAAATCCCCT TATTATGCTG CTTCTGGCTT CTGCAGTCAT CAGTGTTTTA 240 

ATGCATCAGT TTGATGATGC CGTCAGTATC ACTGTGGCAA TACTTATCGT TGTTACAGTT 300 

GCCTTTGTTC AGGAATATCG TTCAGAAAAA TCTCTTGAAG AATTGAGTAA ACTTGTGCCA 360 

CCAGAATGCC ATTGTGTGCG TGAAGGAAAA TTGGAGCATA CACTTGCCCG AGACTTGGTT 420 

CCAGGTGATA CAGTTTGCCT TTCTGTTGGG GATAGAGTTC CTGCTGACTT ACGCTTGTTT 480 

GAGGCTGTGG ATCTTTCCAT TGATGAGTCC AGCTTGACAG GTGAGACAAC GCCTTGTTCT 540 

AAGGTGACAG CTCCTCAGCC AGCTGCAACT AATGGAGATC TTGCATCGAG AAGTAACATT 600 

GCCTTTATGG GAACACTGGT CAGATGTGGC AAAGCAAAGG GTGTTGTCAT TGGAACAGGA 660 

GAAAATTCTG AATTTGGGGA GGTTTTTAAA ATGATGCAAG CAGAAGAGGC ACCAAAAACC 720 

CCTCTGCAGA AGAGCATGGA CCTCTTAGGA AAACAACTTT CCTTTTACTC CTTTGGTATA 780 

ATAGGAATCA TCATGTTGGT TGGCTGGTTA CTGGGAAAAG ATATCCTGGA AATGTTTACT 840 

ATTAGTGTAA GTTTGGCTGT AGCAGCAATT CCTGAAGGTC TCCCCATTGT GGTCACAGTG 900 

ACGCTAGCTC TTGGTGTTAT GAGAATGGTG AAGAAAAGGG CCATTGTGAA AAAGCTGCCT 960 

ATTGTTGAAA CTCTGGGCTG CTGTAATGTG ATTTGTTCAG ATAAAACTGG AACACTGACG 1020 

AAGAATGAAA TGACTGTTAC TCACATATTT ACTTCAGATG GTCTGCATGC TGAGGTTACT 1080 

GGAGTTGGCT ATAATCAATT TGGGGAAGTG ATTGTTGATG GTGATGTTGT TCATGGATTC 1140 

TATAACCCAG CTGTTAGCAG AATTGTTGAG GCGGGCTGTG TGTGCAATGA TGCTGTAATT 1200 

AGAAACAATA CTCTAATGGG GAAGCCAACA GAAGGGGCCT TAATTGCTCT TGCAATGAAG 1260 

ATGGGTCTTG ATGGACTTCA ACAAGACTAC ATCAGAAAAG CTGAATACCC TTTTAGCTCT 1320 

GAGCAAAAGT GGATGGCTGT TAAGTGTGTA CACCGAACAC AGCAGGACAG ACCAGAGATT 1380 

TGTTTTATGA AAGGTGCTTA CGAACAAGTA ATTAAGTACT GTACTACATA CCAGAGCAAA 1440 

GGGCAGACCT TGACACTTAC TCAGCAGCAG AGAGATGTGT ACCAACAAGA GAAGGCACGC 1500 

ATGGGCTCAG CGGGACTCAG AGTTCTTGCT TTGGCTTCTG GTCCTGAACT GGGACAGCTG 1560 

ACATTTCTTG GCTTGGTGGG AATCATTGAT CCACCTAGAA CTGGTGTGAA AGAAGCTGTT 1620 

ACAACACTCA TTGCCTCAGG AGTATCAATA AAAATGATTA CTGGAGATTC ACAGGAGACT 1680 

GCAGTTGCAA TCGCCAGTCG TCTGGGATTG TATTCCAAAA CTTCCCAGTC AGTCTCAGGA 1740 

GAAGAAATAG ATGCAATGGA TGTTCAGCAG CTTTCACAAA TAGTACCAAA GGTTGCAGTA 1800 

TTTTACAGAG CTAGCCCAAG GCACAAGATG AAAATTATTA AGTCGCTACA GAAGAACGGT 1860 

TCAGTTGTAG CCATGACAGG AGATGGAGTA AATGATGCAG TTGCTCTGAA GGCTGCAGAC 1920 






ATTGGAGTTG CGATGGGCCA GACTGGTACA GATGTTTGCA AAGAGGCAGC AGACATGATC 1980 

CTAGTGGATG ATGATTTTCA AACCATAATG TCTGCAATCG AAGAGGGTAA AGGGATTTAT 2040 

AATAACATTA AAAATTTCGT TAGATTCCAG CTGAGCACGA GTATAGCAGC ATTAACTTTA 2100 

ATCTCATTGG CTACATTAAT GAACTTTCCT AATCCTCTCA ATGCCATGCA GATTTTGTGG 2160 

ATCAATATTA TTATGGATGG ACCCCCAGCT CAGAGCCTTG GAGTAGAACC AGTGGATAAA 2220 

GATGTCATTC GTAAACCTCC TCGCAACTGG AAAGACAGCA TTTTGACTAA AAACTTGATA 2280 

CTTAAAATAC TTGTTTCATC AATAATCATT GTTTGTGGGA CTTTGTTTGT CTTCTGGCGT 2340 

GAGCTACGAG ACAATGTGAT TACACCTCGA GACACAACAA TGACCTTCAC ATGCTTTGTG 2400 

TTTTTTGACA TGTTCAATGC ACTAAGTTCC AGATCCCAGA CCAAGTCTGT GTTTGAGATT 2460 

GGACTCTGCA GTAATAGAAT GTTTTGCTAT GCAGTTCTTG GATCCATCAT GGGACAATTA 2520 

CTAGTTATTT ACTTTCCTCC GCTTCAGAAG GTTTTTCAGA CTGAGAGCCT AAGCATACTG 2580 

GATCTGTTGT TTCTTTTGGG TCTCACCTCA TCAGTGTGCA TAGTGGCAGA AATTATAAAG 2640 

AAGGTTGAAA GGAGCAGGGA AAAGATCCAG AAGCATGTTA GTTCGACATC ATCATCTTTT 2700 
CTTGAAGTAT GA 

SEQ ID NO:208 PAJ5 Protein sequence: 
Protein Accession #: AAF27813 

1 11 21 31 41 51 

| | | | | | 

MIPVLTSKKA SELPVSEVAS ILQADLQNGL NKCEVSHRRA FHGWNEFDIS EDEPLWKKYI 60 
SQFKNPLIML LLASAVISVL MHQFDDAVSI TVAILIVVTV AFVQEYRSEK SLEELSKLVP 120 
PECHCVREGK LEHTLARDLV PGDTVCLSVG DRVPADLRLF EAVDLSIDES SLTGETTPCS 180 
KVTAPQPAAT NGDLASRSNI AFMGTLVRCG KAKGVVIGTG ENSEFGEVFK MMQAEEAPKT 240 
PLQKSMDLLG KQLSFYSFGI IGIIMLVGWL LGKDILEMFT ISVSLAVAAI PEGLPIVVTV 300 
TLALGVMRMV KKRAIVKKLP IVETLGCCNV ICSDKTGTLT KNEMTVTHIF TSDGLHAEVT 360 
GVGYNQFGEV IVDGDVVHGF YNPAVSRIVE AGCVCNDAVI RNNTLMGKPT EGALIALAMK 420 
MGLDGLQQDY IRKAEYPFSS EQKWMAVKCV HRTQQDRPEI CFMKGAYEQV IKYCTTYQSK 480 
GQTLTLTQQQ RDVYQQEKAR MGSAGLRVLA LASGPELGQL TFLGLVGIID PPRTGVKEAV 540 
TTLIASGVSI KMITGDSQET AVAIASRLGL YSKTSQSVSG EEIDAMDVQQ LSQIVPKVAV 600 
FYRASPRHKM KIIKSLQKNG SVVAMTGDGV NDAVALKAAD IGVAMGQTGT DVCKEAADMI 660 
LVDDDFQTIM SAIEEGKGIY NNIKNFVRFQ LSTSIAALTL ISLATLMNFP NPLNAMQILW 720 
INIIMDGPPA QSLGVEPVDK DVIRKPPRNW KDSILTKNLI LKILVSSIII VCGTLFVFWR 780 
ELRDNVITPR DTTMTFTCFV FFDMFNALSS RSQTKSVFEI GLCSNRMFCY AVLGSIMGQL 840 
LVIYFPPLQK VFQTESLSIL DLLFLLGLTS SVCIVAEIIK KVERSREKIQ KHVSSTSSSF 900 
LEV 

SEQ ID NO:209 PAV4 VARIANT 1 DNA SEQUENCE 

Nucleic Acid Accession #: N62096 

Coding sequence: 1-1284 (underlined sequences correspond to start and stop codons) 



1 11 21 31 41 51 

| | | | | | 

ATGGGCTACC AGAGGCAGGA GCCTGTCATC CCGCCGCAGA GAGGATTGCC TTATTCAATG 60 

AAGCAAGCTG GGTTTCCTTT GGGAATATTG CTTTTATTCT GGGTTTCATA TGTTACAGAC 120 

TTTTCCCTTG TTTTATTGAT AAAAGGAGGG GCCCTCTCTG GAACAGATAC CTACCAGTCT 180 

TTGGTCAATA AAACTTTCGG CTTTCCAGGG TATCTGCTCC TCTCTGTTCT TCAGTTTTTG 240 

TATCCTTTTA TAGCAATGAT AAGTTACAAT ATAATAGCTG GAGATACTTT GAGCAAAGTT 300 

TTTCAAAGAA TCCCAGGAGT TGATCCTGAA AACGTGTTTA TTGGTCGCCA CTTCATTATT 360 

GGACTTTCCA CAGTTACCTT TACTCTGCCT TTATCCTTGT ACCGAAATAT AGCAAAGCTT 420 

GGAAAGGTCT CCCTCATCTC TACAGGTTTA ACAACTCTGA TTCTTGGAAT TGTAATGGCA 480 

AGGGCAATTT CACTGGGTCC ACACATACCA AAAACAGAAG ACGCTTGGGT ATTTGCAAAG 540 

CCCAATGCCA TTCAAGCGGT CGGGGTTATG TCTTTTGCAT TTATTTGCCA CCATAACTCC 600 

TTCTTAGTTT ACAGTTCTCT AGAAGAACCC ACAGTAGCTA AGTGGTCCCG CCTTATCCAT 660 

ATGTCCATCG TGATTTCTGT ATTTATCTGT ATATTCTTTG CTACATGTGG ATACTTGACA 720 

TTTACTGGCT TCACCCAAGG GGACTTATTT GAAAATTACT GCAGAAATGA TGACCTGGTA 780 

ACATTTGGAA GATTTTGTTA TGGTGTCACT GTCATTTTGA CATACCCTAT GGAATGCTTT 840 

GTGACAAGAG AGGTAATTGC CAATGTGTTT TTTGGTGGGA ATCTTTCATC GGTTTTCCAC 900 

ATTGTTGTAA CAGTGATGGT CATCACTGTA GCCACGCTTG TGTCATTGCT GATTGATTGC 960 

CTCGGGATAG TTCTAGAACT CAATGGTGTG CTCTGTGCAA CTCCCCTCAT TTTTATCATT 1020 

CCATCAGCCT GTTATCTGAA ACTGTCTGAA GAACCAAGGA CACACTCCGA TAAGATTATG 1080 

TCTTGTGTCA TGCTTCCCAT TGGTGCTGTG GTGATGGTTT TTGGATTCGT CATGGCTATT 1140 

ACAAATACTC AAGACTGCAC CCATGGGCAG GAAATGTTCT ACTGCTTTCC TGACAATTTC 1200 

TCTCTCACAA ATACCTCAGA GTCTCATGTT CAGCAGACAA CACAACTTTC TACTTTAAAT 1260 
ATTAGTATCT TTCAACTCGA GTAA 



SEQ ID NO:210 PAV4 Variant 1 Protein sequence: 
Protein Accession #: none found 

1 11 21 31 41 51 

| | | | | | 

MGYQRQEPVI PPQRGLPYSM KQAGFPLGIL LLFWVSYVTD FSLVLLIKGG ALSGTDTYQS 60 
LVNKTFGFPG YLLLSVLQFL YPFIAMISYN IIAGDTLSKV FQRIPGVDPE NVFIGRHFII 120 
GLSTVTFTLP LSLYRNIAKL GKVSLISTGL TTLILGIVMA RAISLGPHIP KTEDAWVFAK 180 
PNAIQAVGVM SFAFICHHNS FLVYSSLEEP TVAKWSRLIH MSIVISVFIC IFFATCGYLT 240 
FTGFTQGDLF ENYCRNDDLV TFGRFCYGVT VILTYPMECF VTREVIANVF FGGNLSSVFH 300 
IVVTVMVITV ATLVSLLIDC LGIVLELNGV LCATPLIFII PSACYLKLSE EPRTHSDKIM 360 
SCVMLPIGAV VMVFGFVMAI TNTQDCTHGQ EMFYCFPDNF SLTNTSESHV QQTTQLSTLN 420 
ISIFQLE 



SEQ ID NO:211 PAV4 VARIANT 2 DNA SEQUENCE 

Nucleic Acid Accession #: N62096 

Coding sequence: 1-1203 (underlined sequences correspond to start and stop codons) 



1 11 21 31 41 51 

| | | | | | 

ATGGGCTACC AGAGGCAGGA GCCTGTCATC CCGCCGCAGT TTTCCCTTGT TTTATTGATA 60 

AAAGGAGGGG CCCTCTCTGG AACAGATACC TACCAGTCTT TGGTCAATAA AACTTTCGGC 120 

TTTCCAGGGT ATCTGCTCCT CTCTGTTCTT CAGTTTTTGT ATCCTTTTAT AGCAATGATA 180 

AGTTACAATA TAATAGCTGG AGATACTTTG AGCAAAGTTT TTCAAAGAAT CCCAGGAGTT 240 

GATCCTGAAA ACGTGTTTAT TGGTCGCCAC TTCATTATTG GACTTTCCAC AGTTACCTTT 300 

ACTCTGCCTT TATCCTTGTA CCGAAATATA GCAAAGCTTG GAAAGGTCTC CCTCATCTCT 360 

ACAGGTTTAA CAACTCTGAT TCTTGGAATT GTAATGGCAA GGGCAATTTC ACTGGGTCCA 420 

CACATACCAA AAACAGAAGA CGCTTGGGTA TTTGCAAAGC CCAATGCCAT TCAAGCGGTC 480 

GGGGTTATGT CTTTTGCATT TATTTGCCAC CATAACTCCT TCTTAGTTTA CAGTTCTCTA 540 

GAAGAACCCA CAGTAGCTAA GTGGTCCCGC CTTATCCATA TGTCCATCGT GATTTCTGTA 600 

TTTATCTGTA TATTCTTTGC TACATGTGGA TACTTGACAT TTACTGGCTT CACCCAAGGG 660 

GACTTATTTG AAAATTACTG CAGAAATGAT GACCTGGTAA CATTTGGAAG ATTTTGTTAT 720 

GGTGTCACTG TCATTTTGAC ATACCCTATG GAATGCTTTG TGACAAGAGA GGTAATTGCC 780 

AATGTGTTTT TTGGTGGGAA TCTTTCATCG GTTTTCCACA TTGTTGTAAC AGTGATGGTC 840 

ATCACTGTAG CCACGCTTGT GTCATTGCTG ATTGATTGCC TCGGGATAGT TCTAGAACTC 900 

AATGGTGTGC TCTGTGCAAC TCCCCTCATT TTTATCATTC CATCAGCCTG TTATCTGAAA 960 

CTGTCTGAAG AACCAAGGAC ACACTCCGAT AAGATTATGT CTTGTGTCAT GCTTCCCATT 1020 

GGTGCTGTGG TGATGGTTTT TGGATTCGTC ATGGCTATTA CAAATACTCA AGACTGCACC 1080 

CATGGGCAGG AAATGTTCTA CTGCTTTCCT GACAATTTCT CTCTCACAAA TACCTCAGAG 1140 

TCTCATGTTC AGCAGACAAC ACAACTTTCT ACTTTAAATA TTAGTATCTT TCAACTCGAG 1200 
TAA 



SEQ ID NO:212 PAV4 Variant 2 Protein sequence: 
Protein Accession #: none found 



1 11 21 31 41 51 

| | | | | | 

MGYQRQEPVI PPQFSLVLLI KGGALSGTDT YQSLVNKTFG FPGYLLLSVL QFLYPFIAMI 60 

SYNIIAGDTL SKVFQRIPGV DPENVFIGRH FIIGLSTVTF TLPLSLYRNI AKLGKVSLIS 120 

TGLTTLILGI VMARAISLGP HIPKTEDAWV FAKPNAIQAV GVMSFAFICH HNSFLVYSSL 180 

EEPTVAKWSR LIHMSIVISV FICIFFATCG YLTFTGFTQG DLFENYCRND DLVTFGRFCY 240 

GVTVILTYPM ECFVTREVIA NVFFGGNLSS VFHIVVTVMV ITVATLVSLL IDCLGIVLEL 300 

NGVLCATPLI FIIPSACYLK LSEEPRTHSD KIMSCVMLPI GAVVMVFGFV MAITNTQDCT 360 
HGQEMFYCFP DNFSLTNTSE SHVQQTTQLS TLNISIFQLE 
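
Because the stated coding-sequence range of each DNA entry includes both the start and the stop codon, the residue count of the paired protein entry can be predicted from the coordinates alone. The following is a minimal sketch, not part of the original listing, of that arithmetic. The function name `expected_protein_length` is hypothetical, and the sketch assumes 1-based inclusive coordinates with exactly one terminal stop codon.

```python
def expected_protein_length(cds_start: int, cds_end: int) -> int:
    """Number of residues encoded by a coding sequence that includes its stop codon."""
    cds_length = cds_end - cds_start + 1  # inclusive coordinates
    if cds_length % 3 != 0:
        raise ValueError("coding sequence length is not a multiple of three")
    return cds_length // 3 - 1  # the stop codon encodes no residue


def count_residues(listing_rows: str) -> int:
    """Count residue letters in protein listing rows, ignoring spaces and counters."""
    return sum(ch.isalpha() for ch in listing_rows)


# PAV4 variant 2: coding sequence 1-1203.
print(expected_protein_length(1, 1203))  # 400

# Final row of the PAV4 variant 2 protein entry, which follows position 360.
last_row = "HGQEMFYCFP DNFSLTNTSE SHVQQTTQLS TLNISIFQLE"
print(360 + count_residues(last_row))  # 400
```

A mismatch between the two numbers for any entry would suggest that residues or bases were lost or merged during extraction.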

SEQ ID NO:213 PAV4 VARIANT 3 DNA SEQUENCE 

Nucleic Acid Accession #: N62096 

Coding sequence: 1-1 140 (underlined sequences correspond to start and stop codons) 

1 11 21 31 41 51 

\ I I I ! i 

ATGGGCTACC AGAGGCAGGA GCCTGTCATC CCGCCGCAGG TCAATAAAAC TTTCGGCTTT 60 

CCAGGGTATC TGCTCCTCTC TGTTCTTCAG TTTTTGTATC CTTTTATAGC AATGATAAGT 120 

TACAATATAA TAGCTGGAGA TACTTTGAGC AAAGTTTTTC AAAGAATCCC AGGAGTTGAT 180 

CCTGAAAACG TGTTTATTGG TCGCCACTTC ATTATTGGAC TTTCCACAGT TACCTTTACT 240 

CTGCCTTTAT CCTTGTACCG AAATATAGCA AAGCTTGGAA AGGTCTCCCT CATCTCTACA 300 

GGTTTAACAA CTCTGATTCT TGGAATTGTA ATGGCAAGGG CAATTTCACT GGGTCCACAC 360 

ATACCAAAAA CAGAAGACGC TTGGGTATTT GCAAAGCCCA ATGCCATTCA AGCGGTCGGG 420 

GTTATGTCTT TTGCATTTAT TTGCCACCAT AACTCCTTCT TAGTTTACAG TTCTCTAGAA 480 

GAACCCACAG TAGCTAAGTG GTCCCGCCTT ATCCATATGT CCATCGTGAT TTCTGTATTT 540 

ATCTGTATAT TCTTTGCTAC ATGTGGATAC TTGACATTTA CTGGCTTCAC CCAAGGGGAC 600 

TTATTTGAAA ATTACTGCAG AAATGATGAC CTGGTAACAT TTGGAAGATT TTGTTATGGT 660 

GTCACTGTCA TTTTGACATA CCCTATGGAA TGCTTTGTGA CAAGAGAGGT AATTGCCAAT 720 

GTGTTTTTTG GTGGGAATCT TTCATCGGTT TTCCACATTG TTGTAACAGT GATGGTCATC 780 

ACTGTAGCCA CGCTTGTGTC ATTGCTGATT GATTGCCTCG GGATAGTTCT AGAACTCAAT 840 

GGTGTGCTCT GTGCAACTCC CCTCATTTTT ATCATTCCAT CAGCCTGTTA TCTGAAACTG 900 

TCTGAAGAAC CAAGGACACA CTCCGATAAG ATTATGTCTT GTGTCATGCT TCCCATTGGT 960 

GCTGTGGTGA TGGTTTTTGG ATTCGTCATG GCTATTACAA ATACTCAAGA CTGCACCCAT 1020 

GGGCAGGAAA TGTTCTACTG CTTTCCTGAC AATTTCTCTC TCACAAATAC CTCAGAGTCT 1080 
CATGTTCAGC AGACAACACA ACTTTCTACT TTAAATATTA GTATCTTTCA ACTCGAGTAA 



SEQ ID NO:214 PAV4 Variant 3 Protein sequence: 
Protein Accession #: none found 

1 11 21 31 41 51 

| | | | | | 




MGYQRQEPVI PPQVNKTFGF PGYLLLSVLQ FLYPFIAMIS YNIIAGDTLS KVFQRIPGVD 60 

PENVFIGRHF IIGLSTVTFT LPLSLYRNIA KLGKVSLIST GLTTLILGIV MARAISLGPH 120 

IPKTEDAWVF AKPNAIQAVG VMSFAFICHH NSFLVYSSLE EPTVAKWSRL IHMSIVISVF 180 

ICIFFATCGY LTFTGFTQGD LFENYCRNDD LVTFGRFCYG VTVILTYPME CFVTREVIAN 240 

VFFGGNLSSV FHIVVTVMVI TVATLVSLLI DCLGIVLELN GVLCATPLIF IIPSACYLKL 300 

SEEPRTHSDK IMSCVMLPIG AVVMVFGFVM AITNTQDCTH GQEMFYCFPD NFSLTNTSES 360 
HVQQTTQLST LNISIFQLE 



SEQ ID NO:215 PAV4 VARIANT 4 DNA SEQUENCE: 

Nucleic Acid Accession #: N62096 

Coding sequence: 1-1389 (underlined sequences correspond to start and stop codons) 



1 11 21 31 41 51 

| | | | | | 

ATGGGCTACC AGAGGCAGGA GCCTGTCATC CCGCCGCAGA GAGATTTAGA TGACAGAGAA 60 

ACCCTTGTTT CTGAACATGA GTATAAAGAG AAAACCTGTC AGTCTGCTGC TCTTTTTAAT 120 

GTTGTCAACT CGATTATAGG ATCTGGTATA ATAGGATTGC CTTATTCAAT GAAGCAAGCT 180 

GGGTTTCCTT TGGGAATATT GCTTTTATTC TGGGTTTCAT ATGTTACAGA CTTTTCCCTT 240 

GTTTTATTGA TAAAAGGAGG GGCCCTCTCT GGAACAGATA CCTACCAGTC TTTGGTCAAT 300 

AAAACTTTCG GCTTTCCAGG GTATCTGCTC CTCTCTGTTC TTCAGTTTTT GTATCCTTTT 360 

ATAGCAATGA TAAGTTACAA TATAATAGCT GGAGATACTT TGAGCAAAGT TTTTCAAAGA 420 

ATCCCAGGAG TTGATCCTGA AAACGTGTTT ATTGGTCGCC ACTTCATTAT TGGACTTTCC 480 

ACAGTTACCT TTACTCTGCC TTTATCCTTG TACCGAAATA TAGCAAAGCT TGGAAAGGTC 540 

TCCCTCATCT CTACAGGTTT AACAACTCTG ATTCTTGGAA TTGTAATGGC AAGGGCAATT 600 

TCACTGGGTC CACACATACC AAAAACAGAA GACGCTTGGG TATTTGCAAA GCCCAATGCC 660 

ATTCAAGCGG TCGGGGTTAT GTCTTTTGCA TTTATTTGCC ACCATAACTC CTTCTTAGTT 720 

TACAGTTCTC TAGAAGAACC CACAGTAGCT AAGTGGTCCC GCCTTATCCA TATGTCCATC 780 

GTGATTTCTG TATTTATCTG TATATTCTTT GCTACATGTG GATACTTGAC ATTTACTGGC 840 

TTCACCCAAG GGGACTTATT TGAAAATTAC TGCAGAAATG ATGACCTGGT AACATTTGGA 900 

AGATTTTGTT ATGGTGTCAC TGTCATTTTG ACATACCCTA TGGAATGCTT TGTGACAAGA 960 

GAGGTAATTG CCAATGTGTT TTTTGGTGGG AATCTTTCAT CGGTTTTCCA CATTGTTGTA 1020 

ACAGTGATGG TCATCACTGT AGCCACGCTT GTGTCATTGC TGATTGATTG CCTCGGGATA 1080 

GTTCTAGAAC TCAATGGTGT GCTCTGTGCA ACTCCCCTCA TTTTTATCAT TCCATCAGCC 1140 

TGTTATCTGA AACTGTCTGA AGAACCAAGG ACACACTCCG ATAAGATTAT GTCTTGTGTC 1200 

ATGCTTCCCA TTGGTGCTGT GGTGATGGTT TTTGGATTCG TCATGGCTAT TACAAATACT 1260 

CAAGACTGCA CCCATGGGCA GGAAATGTTC TACTGCTTTC CTGACAATTT CTCTCTCACA 1320 

AATACCTCAG AGTCTCATGT TCAGCAGACA ACACAACTTT CTACTTTAAA TATTAGTATC 1380 
TTTCAATGA 



SEQ ID NO:216 PAV4 Variant 4 Protein sequence: 
Protein Accession #: none found 



1 11 21 31 41 51 

| | | | | | 

MGYQRQEPVI PPQRDLDDRE TLVSEHEYKE KTCQSAALFN VVNSIIGSGI IGLPYSMKQA 60 

GFPLGILLLF WVSYVTDFSL VLLIKGGALS GTDTYQSLVN KTFGFPGYLL LSVLQFLYPF 120 

IAMISYNIIA GDTLSKVFQR IPGVDPENVF IGRHFIIGLS TVTFTLPLSL YRNIAKLGKV 180 

SLISTGLTTL ILGIVMARAI SLGPHIPKTE DAWVFAKPNA IQAVGVMSFA FICHHNSFLV 240 

YSSLEEPTVA KWSRLIHMSI VISVFICIFF ATCGYLTFTG FTQGDLFENY CRNDDLVTFG 300 

RFCYGVTVIL TYPMECFVTR EVIANVFFGG NLSSVFHIVV TVMVITVATL VSLLIDCLGI 360 

VLELNGVLCA TPLIFIIPSA CYLKLSEEPR THSDKIMSCV MLPIGAVVMV FGFVMAITNT 420 
QDCTHGQEMF YCFPDNFSLT NTSESHVQQT TQLSTLNISI FQ 

SEQ ID NO:217 PAV9 DNA SEQUENCE 

Nucleic Acid Accession #: NM_017636 

Coding sequence: 1-3501 (underlined sequences correspond to start and stop codons) 

1 11 21 31 41 51 

| | | | | | 

ATGGAGGATG CCTTCGGGGC AGCCGTGGTG ACCGTGTGGG ACAGCGATGC ACACACCACG 60 

GAGAAGCCCA CCGATGCCTA CGGAGAGCTG GACTTCACGG GGGCCGGCCG CAAGCACAGC 120 

AATTTCCTCC GGCTCTCTGA CCGAACGGAT CCAGCTGCAG TTTATAGTCT GGTCACACGC 180 

ACATGGGGCT TCCGTGCCCC GAACCTGGTG GTGTCAGTGC TGGGGGGATC GGGGGGCCCC 240 

GTCCTCCAGA CCTGGCTGCA GGACCTGCTG CGTCGTGGGC TGGTGCGGGC TGCCCAGAGC 300 

ACAGGAGCCT GGATTGTCAC TGGGGGTCTG CACACGGGCA TCGGCCGGCA TGTTGGTGTG 360 

GCTGTACGGG ACCATCAGAT GGCCAGCACT GGGGGCACCA AGGTGGTGGC CATGGGTGTG 420 

GCCCCCTGGG GTGTGGTCCG GAATAGAGAC ACCCTCATCA ACCCCAAGGG CTCGTTCCCT 480 

GCGAGGTACC GGTGGCGCGG TGACCCGGAG GACGGGGTCC AGTTTCCCCT GGACTACAAC 540 

TACTCGGCCT TCTTCCTGGT GGACGACGGC ACACACGGCT GCCTGGGGGG CGAGAACCGC 600 

TTCCGCTTGC GCCTGGAGTC CTACATCTCA CAGCAGAAGA CGGGCGTGGG AGGGACTGGA 660 

ATTGACATCC CTGTCCTGCT CCTCCTGATT GATGGTGATG AGAAGATGTT GACGCGAATA 720 

GAGAACGCCA CCCAGGCTCA GCTCCCATGT CTCCTCGTGG CTGGCTCAGG GGGAGCTGCG 780 

GACTGCCTGG CGGAGACCCT GGAAGACACT CTGGCCCCAG GGAGTGGGGG AGCCAGGCAA 840 

GGCGAAGCCC GAGATCGAAT CAGGCGTTTC TTTCCCAAAG GGGACCTTGA GGTCCTGCAG 900 

GCCCAGGTGG AGAGGATTAT GACCCGGAAG GAGCTCCTGA CAGTCTATTC TTCTGAGGAT 960 

GGGTCTGAGG AATTCGAGAC CATAGTTTTG AAGGCCCTTG TGAAGGCCTG TGGGAGCTCG 1020 

GAGGCCTCAG CCTACCTGGA TGAGCTGCGT TTGGCTGTGG CTTGGAACCG CGTGGACATT 1080 

GCCCAGAGTG AACTCTTTCG GGGGGACATC CAATGGCGGT CCTTCCATCT CGAAGCTTCC 1140 

CTCATGGACG CCCTGCTGAA TGACCGGCCT GAGTTCGTGC GCTTGCTCAT TTCCCACGGC 1200 

CTCAGCCTGG GCCACTTCCT GACCCCGATG CGCCTGGCCC AACTCTACAG CGCGGCGCCC 1260 

TCCAACTCGC TCATCCGCAA CCTTTTGGAC CAGGCGTCCC ACAGCGCAGG CACCAAAGCC 1320 

CCAGCCCTAA AAGGGGGAGC TGCGGAGCTC CGGCCCCCTG ACGTGGGGCA TGTGCTGAGG 1380 

ATGCTGCTGG GGAAGATGTG CGCGCCGAGG TACCCCTCCG GGGGCGCCTG GGACCCTCAC 1440 

CCAGGCCAGG GCTTCGGGGA GAGCATGTAT CTGCTCTCGG ACAAGGCCAC CTCGCCGCTC 1500 

TCGCTGGATG CTGGCCTCGG GCAGGCCCCC TGGAGCGACC TGCTTCTTTG GGCACTGTTG 1560 

CTGAACAGGG CACAGATGGC CATGTACTTC TGGGAGATGG GTTCCAATGC AGTTTCCTCA 1620 

GCTCTTGGGG CCTGTTTGCT GCTCCGGGTG ATGGCACGCC TGGAGCCTGA CGCTGAGGAG 1680 

GCAGCACGGA GGAAAGACCT GGCGTTCAAG TTTGAGGGGA TGGGCGTTGA CCTCTTTGGC 1740 

GAGTGCTATC GCAGCAGTGA GGTGAGGGCT GCCCGCCTCC TCCTCCGTCG CTGCCCGCTC 1800 

TGGGGGGATG CCACTTGCCT CCAGCTGGCC ATGCAAGCTG ACGCCCGTGC CTTCTTTGCC 1860 

CAGGATGGGG TACAGTCTCT GCTGACACAG AAGTGGTGGG GAGATATGGC CAGCACTACA 1920 

CCCATCTGGG CCCTGGTTCT CGCCTTCTTT TGCCCTCCAC TCATCTACAC CCGCCTCATC 1980 

ACCTTCAGGA AATCAGAAGA GGAGCCCACA CGGGAGGAGC TAGAGTTTGA CATGGATAGT 2040 

GTCATTAATG GGGAAGGGCC TGTCGGGACG GCGGACCCAG CCGAGAAGAC GCCGCTGGGG 2100 

GTCCCGCGCC AGTCGGGCCG TCCGGGTTGC TGCGGGGGCC GCTGCGGGGG GCGCCGGTGC 2160 

CTACGCCGCT GGTTCCACTT CTGGGGCGCG CCGGTGACCA TCTTCATGGG CAACGTGGTC 2220 

AGCTACCTGC TGTTCCTGCT GCTTTTCTCG CGGGTGCTGC TCGTGGATTT CCAGCCGGCG 2280 

CCGCCCGGCT CCCTGGAGCT GCTGCTCTAT TTCTGGGCTT TCACGCTGCT GTGCGAGGAA 2340 

CTGCGCCAGG GCCTGAGCGG AGGCGGGGGC AGCCTCGCCA GCGGGGGCCC CGGGCCTGGC 2400 

CATGCCTCAC TGAGCCAGCG CCTGCGCCTC TACCTCGCCG ACAGCTGGAA CCAGTGCGAC 2460 

CTAGTGGCTC TCACCTGCTT CCTCCTGGGC GTGGGCTGCC GGCTGACCCC GGGTTTGTAC 2520 

CACCTGGGCC GCACTGTCCT CTGCATCGAC TTCATGGTTT TCACGGTGCG GCTGCTTCAC 2580 

ATCTTCACGG TCAACAAACA GCTGGGGCCC AAGATCGTCA TCGTGAGCAA GATGATGAAG 2640 

GACGTGTTCT TCTTCCTCTT CTTCCTCGGC GTGTGGCTGG TAGCCTATGG CGTGGCCACG 2700 

GAGGGGCTCC TGAGGCCACG GGACAGTGAC TTCCCAAGTA TCCTGCGCCG CGTCTTCTAC 2760 

CGTCCCTACC TGCAGATCTT CGGGCAGATT CCCCAGGAGG ACATGGACGT GGCCCTCATG 2820 

GAGCACAGCA ACTGCTCGTC GGAGCCCGGC TTCTGGGCAC ACCCTCCTGG GGCCCAGGCG 2880 

GGCACCTGCG TCTCCCAGTA TGCCAACTGG CTGGTGGTGC TGCTCCTCGT CATCTTCCTG 2940 

CTCGTGGCCA ACATCCTGCT GGTCAACTTG CTCATTGCCA TGTTCAGTTA CACATTCGGC 3000 

AAAGTACAGG GCAACAGCGA TCTCTACTGG AAGGCGCAGC GTTACCGCCT CATCCGGGAA 3060 

TTCCACTCTC GGCCCGCGCT GGCCCCGCCC TTTATCGTCA TCTCCCACTT GCGCCTCCTG 3120 

CTCAGGCAAT TGTGCAGGCG ACCCCGGAGC CCCCAGCCGT CCTCCCCGGC CCTCGAGCAT 3180 

TTCCGGGTTT ACCTTTCTAA GGAAGCCGAG CGGAAGCTGC TAACGTGGGA ATCGGTGCAT 3240 

AAGGAGAACT TTCTGCTGGC ACGCGCTAGG GACAAGCGGG AGAGCGACTC CGAGCGTCTG 3300 

AAGCGCACGT CCCAGAAGGT GGACTTGGCA CTGAAACAGC TGGGACACAT CCGCGAGTAC 3360 

GAACAGCGCC TGAAAGTGCT GGAGCGGGAG GTCCAGCAGT GTAGCCGCGT CCTGGGGTGG 3420 

GTGGCCGAGG CCCTGAGCCG CTCTGCCTTG CTGCCCCCAG GTGGGCCGCC ACCCCCTGAC 3480 
CTGCCTGGGT CCAAAGACTG A 



SEQ ID NO:218 PAV9 Protein sequence: 

Protein Accession #: none found 

1 11 21 31 41 51 

| | | | | | 

MEDAFGAAVV TVWDSDAHTT EKPTDAYGEL DFTGAGRKHS NFLRLSDRTD PAAVYSLVTR 60 

TWGFRAPNLV VSVLGGSGGP VLQTWLQDLL RRGLVRAAQS TGAWIVTGGL HTGIGRHVGV 120 

AVRDHQMAST GGTKVVAMGV APWGVVRNRD TLINPKGSFP ARYRWRGDPE DGVQFPLDYN 180 

YSAFFLVDDG THGCLGGENR FRLRLESYIS QQKTGVGGTG IDIPVLLLLI DGDEKMLTRI 240 

ENATQAQLPC LLVAGSGGAA DCLAETLEDT LAPGSGGARQ GEARDRIRRF FPKGDLEVLQ 300 

AQVERIMTRK ELLTVYSSED GSEEFETIVL KALVKACGSS EASAYLDELR LAVAWNRVDI 360 

AQSELFRGDI QWRSFHLEAS LMDALLNDRP EFVRLLISHG LSLGHFLTPM RLAQLYSAAP 420 

SNSLIRNLLD QASHSAGTKA PALKGGAAEL RPPDVGHVLR MLLGKMCAPR YPSGGAWDPH 480 

PGQGFGESMY LLSDKATSPL SLDAGLGQAP WSDLLLWALL LNRAQMAMYF WEMGSNAVSS 540 

ALGACLLLRV MARLEPDAEE AARRKDLAFK FEGMGVDLFG ECYRSSEVRA ARLLLRRCPL 600 

WGDATCLQLA MQADARAFFA QDGVQSLLTQ KWWGDMASTT PIWALVLAFF CPPLIYTRLI 660 

TFRKSEEEPT REELEFDMDS VINGEGPVGT ADPAEKTPLG VPRQSGRPGC CGGRCGGRRC 720 

LRRWFHFWGA PVTIFMGNVV SYLLFLLLFS RVLLVDFQPA PPGSLELLLY FWAFTLLCEE 780 

LRQGLSGGGG SLASGGPGPG HASLSQRLRL YLADSWNQCD LVALTCFLLG VGCRLTPGLY 840 

HLGRTVLCID FMVFTVRLLH IFTVNKQLGP KIVIVSKMMK DVFFFLFFLG VWLVAYGVAT 900 

EGLLRPRDSD FPSILRRVFY RPYLQIFGQI PQEDMDVALM EHSNCSSEPG FWAHPPGAQA 960 

GTCVSQYANW LVVLLLVIFL LVANILLVNL LIAMFSYTFG KVQGNSDLYW KAQRYRLIRE 1020 

FHSRPALAPP FIVISHLRLL LRQLCRRPRS PQPSSPALEH FRVYLSKEAE RKLLTWESVH 1080 

KENFLLARAR DKRESDSERL KRTSQKVDLA LKQLGHIREY EQRLKVLERE VQQCSRVLGW 1140 
VAEALSRSAL LPPGGPPPPD LPGSKD 

SEQ ID NO:219 PBF1 DNA SEQUENCE 

Nucleic Acid Accession #: AA054237 

Coding sequence: 1-894 (underlined sequences correspond to start and stop codons) 

1 11 21 31 41 51 

| | | | | | 

ATGGAGCCGC GGGCGCTCGT CACGGCGCTC AGCCTCGGCC TCAGCCTGTG CTCCCTGGGG 60 

CTGCTCGTCA CGGCCATCTT CACCGACCAC TGGTACGAGA CCGACCCCCG GCGCCACAAG 120 

GAGAGCTGCG AGCGCAGCCG CGCGGGCGCC GACCCCCCGG ACCAGAAGAA CCGCCTGATG 180 

CCGCTGTCGC ACCTGCCGCT GCGGGACTCG CCCCCGCTGG GGCGCCGGCT GCTCCCGGGC 240 

GGCCCGGGGC GCGCCGACCC CGAGTCCTGG CGCTCGCTCC TGGGGCTCGG CGGGCTGGAC 300 

GCCGAGTGCG GCCGGCCCCT CTTCGCCACC TACTCGGGCC TCTGGAGGAA GTGCTACTTC 360 

CTGGGCATCG ACCGGGACAT CGACACCCTC ATCCTGAAAG GTATTGCGCA GCGATGCACG 420 

GCCATCAAGT ACCACTTTTC TCAGCCCATC CGCTTGCGAA ACATTCCTTT TAATTTAACC 480 

AAGACCATAC AGCAAGATGA GTGGCACCTG CTTCATTTAA GAAGAATCAC TGCTGGCTTC 540 

CTCGGCATGG CCGTAGCCGT CCTTCTCTGC GGCTGCATTG TGGCCACAGT CAGTTTCTTC 600 

TGGGAGGAGA GCTTGACCCA GCACGTGGCT GGACTCCTGT TCCTCATGAC AGGGATATTT 660 

TGCACCATTT CCCTCTGTAC TTATGCCGCC AGTATCTCGT ATGATTTGAA CCGGCTCCCA 720 

AAGCTAATTT ATAGCCTGCC TGCTGATGTG GAACATGGTT ACAGCTGGTC CATCTTTTGC 780 

GCCTGGTGCA GTTTAGGCTT TATTGTGGCA GCTGGAGGTC TCTGCATCGC TTATCCGTTT 840 
ATTAGCCGGA CCAAGATTGC ACAGCTAAAG TCTGGCAGAG ACTCCACGGT ATGA 






SEQ ID NO:220 PBF1 Protein sequence: 
Protein Accession #: none found 

1 11 21 31 41 51 

| | | | | | 

MEPRALVTAL SLGLSLCSLG LLVTAIFTDH WYETDPRRHK ESCERSRAGA DPPDQKNRLM 60 
PLSHLPLRDS PPLGRRLLPG GPGRADPESW RSLLGLGGLD AECGRPLFAT YSGLWRKCYF 120 
LGIDRDIDTL ILKGIAQRCT AIKYHFSQPI RLRNIPFNLT KTIQQDEWHL LHLRRITAGF 180 
LGMAVAVLLC GCIVATVSFF WEESLTQHVA GLLFLMTGIF CTISLCTYAA SISYDLNRLP 240 
KLIYSLPADV EHGYSWSIFC AWCSLGFIVA AGGLCIAYPF ISRTKIAQLK SGRDSTV 

SEQ ID NO:221 PCI4 DNA SEQUENCE 

Nucleic Acid Accession #: NM_016570 

Coding sequence: 1-1134 (underlined sequences correspond to start and stop codons) 



1 11 21 31 41 51 

| | | | | | 

ATGAGGCGAC TGAATCGGAA AAAAACTTTA AGTTTGGTAA AAGAGTTGGA TGCCTTTCCG 60 
AAGGTTCCTG AGAGCTATGT AGAGACTTCA GCCAGTGGAG GTACAGTTTC TCTAATAGCA 120 
TTTACAACTA TGGCTTTATT AACCATAATG GAATTCTCAG TATATCAAGA TACATGGATG 180 
AAGTATGAAT ACGAAGTAGA CAAGGATTTT TCTAGCAAAT TAAGAATTAA TATAGATATT 240 
ACTGTTGCCA TGAAGTGTCA ATATGTTGGA GCGGATGTAT TGGATTTAGC AGAAACAATG 300 
GTTGCATCTG CAGATGGTTT AGTTTATGAA CCAACAGTAT TTGATCTTTC ACCACAGCAG 360 
AAAGAGTGGC AGAGGATGCT GCAGCTGATT CAGAGTAGGC TACAAGAAGA GCATTCACTT 420 
CAAGATGTGA TATTTAAAAG TGCTTTTAAA AGTACATCAA CAGCTCTTCC ACCAAGAGAA 480 
GATGATTCAT CACAGTCTCC AAATGCATGC AGAATTCATG GCCATCTATA TGTCAATAAA 540 
GTAGCAGGGA ATTTTCACAT AACAGTGGGC AAGGCAATTC CACATCCTCG TGGTCATGCA 600 
CATTTGGCAG CACTTGTCAA CCATGAATCT TACAATTTTT CTCATAGAAT AGATCATTTG 660 
TCTTTTGGAG AGCTTGTTCC AGCAATTATT AATCCTTTAG ATGGAACTGA AAAAATTGCT 720 
ATAGATCACA ACCAGATGTT CCAATATTTT ATTACAGTTG TGCCAACAAA ACTACATACA 780 
TATAAAATAT CAGCAGACAC CCATCAGTTT TCTGTGACAG AAAGGGAACG TATCATTAAC 840 
CATGCTGCAG GCAGCCATGG AGTCTCTGGG ATATTTATGA AATATGATCT CAGTTCTCTT 900 
ATGGTGACAG TTACTGAGGA GCACATGCCA TTCTGGCAGT TTTTTGTAAG ACTCTGTGGT 960 
ATTGTTGGAG GAATCTTTTC AACAACAGGC ATGTTACATG GAATTGGAAA ATTTATAGTT 1020 
GAAATAATTT GCTGTCGTTT CAGACTTGGA TCCTATAAAC CTGTCAATTC TGTTCCTTTT 1080 
GAGGATGGCC ACACAGACAA CCACTTACCT CTTTTAGAAA ATAATACACA TTGA 



SEQ ID NO:222 PCI4 Protein sequence: 

Protein Accession #: NP_057654 

1 11 21 31 41 51 

| | | | | | 

MRRLNRKKTL SLVKELDAFP KVPESYVETS ASGGTVSLIA FTTMALLTIM EFSVYQDTWM 60 
KYEYEVDKDF SSKLRINIDI TVAMKCQYVG ADVLDLAETM VASADGLVYE PTVFDLSPQQ 120 
KEWQRMLQLI QSRLQEEHSL QDVIFKSAFK STSTALPPRE DDSSQSPNAC RIHGHLYVNK 180 
VAGNFHITVG KAIPHPRGHA HLAALVNHES YNFSHRIDHL SFGELVPAII NPLDGTEKIA 240 
IDHNQMFQYF ITVVPTKLHT YKISADTHQF SVTERERIIN HAAGSHGVSG IFMKYDLSSL 300 
MVTVTEEHMP FWQFFVRLCG IVGGIFSTTG MLHGIGKFIV EIICCRFRLG SYKPVNSVPF 360 
EDGHTDNHLP LLENNTH 

SEQ ID NO:223 PEZ3 DNA SEQUENCE 

Nucleic Acid Accession #: NM_001935.1 

Coding sequence: 76-2301 (underlined sequences correspond to start and stop codons) 



1 11 21 31 41 51 

| | | | | | 

CGCGCGTCTC CGCCGCCCGC GTGACTTCTG CCTGCGCTCC TTCTCTGAAC GCTCACTTCC 60 

GAGGAGACGC CGACGATGAA GACACCGTGG AAGATTCTTC TGGGACTGCT GGGTGCTGCT 120 

GCGCTTGTCA CCATCATCAC CGTGCCCGTG GTTCTGCTGA ACAAAGGCAC AGATGATGCT 180 

ACAGCTGACA GTCGCAAAAC TTACACTCTA ACTGATTACT TAAAAAATAC TTATAGACTG 240 

AAGTTATACT CCTTAAGATG GATTTCAGAT CATGAATATC TCTACAAACA AGAAAATAAT 300 






ATCTTGGTAT TCAATGCTGA ATATGGAAAC AGCTCAGTTT TCTTGGAGAA CAGTACATTT 360 

GATGAGTTTG GACATTCTAT CAATGATTAT TCAATATCTC CTGATGGGCA GTTTATTCTC 420 

TTAGAATACA ACTACGTGAA GCAATGGAGG CATTCCTACA CAGCTTCATA TGACATTTAT 480 

GATTTAAATA AAAGGCAGCT GATTACAGAA GAGAGGATTC CAAACAACAC ACAGTGGGTC 540 

ACATGGTCAC CAGTGGGTCA TAAATTGGCA TATGTTTGGA ACAATGACAT TTATGTTAAA 600 

ATTGAACCAA ATTTACCAAG TTACAGAATC ACATGGACGG GGAAAGAAGA TATAATATAT 660 

AATGGAATAA CTGACTGGGT TTATGAAGAG GAAGTCTTCA GTGCCTACTC TGCTCTGTGG 720 

TGGTCTCCAA ACGGCACTTT TTTAGCATAT GCCCAATTTA ACGACACAGA AGTCCCACTT 780 

ATTGAATACT CCTTCTACTC TGATGAGTCA CTGCAGTACC CAAAGACTGT ACGGGTTCCA 840 

TATCCAAAGG CAGGAGCTGT GAATCCAACT GTAAAGTTCT TTGTTGTAAA TACAGACTCT 900 

CTCAGCTCAG TCACCAATGC AACTTCCATA CAAATCACTG CTCCTGCTTC TATGTTGATA 960 

GGGGATCACT ACTTGTGTGA TGTGACATGG GCAACACAAG AAAGAATTTC TTTGCAGTGG 1020 

CTCAGGAGGA TTCAGAACTA TTCGGTCATG GATATTTGTG ACTATGATGA ATCCAGTGGA 1080 

AGATGGAACT GCTTAGTGGC ACGGCAACAC ATTGAAATGA GTACTACTGG CTGGGTTGGA 1140 

AGATTTAGGC CTTCAGAACC TCATTTTACC CTTGATGGTA ATAGCTTCTA CAAGATCATC 1200 

AGCAATGAAG AAGGTTACAG ACACATTTGC TATTTCCAAA TAGATAAAAA AGACTGCACA 1260 

TTTATTACAA AAGGCACCTG GGAAGTCATC GGGATAGAAG CTCTAACCAG TGATTATCTA 1320 

TACTACATTA GTAATGAATA TAAAGGAATG CCAGGAGGAA GGAATCTTTA TAAAATCCAA 1380 

CTTATTGACT ATACAAAAGT GACATGCCTC AGTTGTGAGC TGAATCCGGA AAGGTGTCAG 1440 

TACTATTCTG TGTCATTCAG TAAAGAGGCG AAGTATTATC AGCTGAGATG TTCCGGTCCT 1500 

GGTCTGCCCC TCTATACTCT ACACAGCAGC GTGAATGATA AAGGGCTGAG AGTCCTGGAA 1560 

GACAATTCAG CTTTGGATAA AATGCTGCAG AATGTCCAGA TGCCCTCCAA AAAACTGGAC 1620 

TTCATTATTT TGAATGAAAC AAAATTTTGG TATCAGATGA TCTTGCCTCC TCATTTTGAT 1680 

AAATCCAAGA AATATCCTCT ACTATTAGAT GTGTATGCAG GCCCATGTAG TCAAAAAGCA 1740 

GACACTGTCT TCAGACTGAA CTGGGCCACT TACCTTGCAA GCACAGAAAA CATTATAGTA 1800 

GCTAGCTTTG ATGGCAGAGG AAGTGGTTAC CAAGGAGATA AGATCATGCA TGCAATCAAC 1860 

AGAAGACTGG GAACATTTGA AGTTGAAGAT CAAATTGAAG CAGCCAGACA ATTTTCAAAA 1920 

ATGGGATTTG TGGACAACAA ACGAATTGCA ATTTGGGGCT GGTCATATGG AGGGTACGTA 1980 

ACCTCAATGG TCCTGGGATC GGGAAGTGGC GTGTTCAAGT GTGGAATAGC CGTGGCGCCT 2040 

GTATCCCGGT GGGAGTACTA TGACTCAGTG TACACAGAAC GTTACATGGG TCTCCCAACT 2100 

CCAGAAGACA ACCTTGACCA TTACAGAAAT TCAACAGTCA TGAGCAGAGC TGAAAATTTT 2160 

AAACAAGTTG AGTACCTCCT TATTCATGGA ACAGCAGATG ATAACGTTCA CTTTCAGCAG 2220 

TCAGCTCAGA TCTCCAAAGC CCTGGTCGAT GTTGGAGTGG ATTTCCAGGC AATGTGGTAT 2280 

ACTGATGAAG ACCATGGAAT AGCTAGCAGC ACAGCACACC AACATATATA TACCCACATG 2340 

AGCCACTTCA TAAAACAATG TTTCTCTTTA CCTTAGCACC TCAAAATACC ATGCCATTTA 2400 

AAGCTTATTA AAACTCATTT TTGTTTTCAT TATCTCAAAA CTGCACTGTC AAGATGATGA 2460 

TGATCTTTAA AATACACACT CAAATCAAGA AACTTAAGGT TACCTTTGTT CCCAAATTTC 2520 

ATACCTATCA TCTTAAGTAG GGACTTCTGT CTTCACAACA GATTATTACC TTACAGAAGT 2580 

TTGAATTATC CGGTCGGGTT TTATTGTTTA AAATCATTTC TGCATCAGCT GCTGAAACAA 2640 

CAAATAGGAA TTGTTTTTAT GGAGGCTTTG CATAGATTCC CTGAGCAGGA TTTTAATCTT 2700 

TTTCTAACTG GACTGGTTCA AATGTTGTTC TCTTCTTTAA AGGGATGGCA AGATGTGGGC 2760 

AGTGATGTCA CTAGGGCAGG GACAGGATAA GAGGGATTAG GGAGAGAAGA TAGCAGGGCA 2820 

TGGCTGGGAA CCCAAGTCCA AGCATACCAA CACGAGCAGG CTACTGTCAG CTCCCCTCGG 2880 

AGAAGAGCTG TTCACCACGA GACTGGCACA GTTTTCTGAG AAAGACTATT CAAACAGTCT 2940 

CAGGAAATCA AATATCGAAA GCACTGACTT CTAAGTAAAC CACAGCAGTT GAAAGACTCC 3000 

AAAGAAATGT AAGGGAAACT GCCAGCAACG CAGCCCCCAG GTGCCAGTTA TGGCTATAGG 3060 

TGCTACAAAA ACACAGCAAG GGTGATGGGA AAGCATTGTA AATGTGCTTT TAAAAAAAAA 3120 

TACTGATGTT CCTAGTGAAA GAGGCAGCTT GAAACTGAGA TGTGAACACA TCAGCTTGCC 3180 

CTGTTAAAAG ATGAAAATAT TTGTATCACA AATCTTAACT TGAAGGAGTC CTTGCATCAA 3240 

TTTTTCTTAT TTCATTTCTT TGAGTGTCTT AATTAAAAGA ATATTTTAAC TTCCTTGGAC 3300 

TCATTTTAAA AAATGGAACA TAAAATACAA TGTTATGTAT TATTATTCCC ATTCTACATA 3360 
CTATGGAATT TCTCCCAGTC ATTTAATAAA TGTGCCTTCA TTTTTTC 



SEQ ID NO:224 PEZ3 Protein sequence: 

Protein Accession #: NP_001926.1 

1 11 21 31 41 51 



MKTPWKILLG LLGAAALVTI ITVPVVLLNK GTDDATADSR KTYTLTDYLK NTYRLKLYSL 60 

RWISDHEYLY KQENNILVFN AEYGNSSVFL ENSTFDEFGH SINDYSISPD GQFILLEYNY 120 

VKQWRHSYTA SYDIYDLNKR QLITEERIPN NTQWVTWSPV GHKLAYVWNN DIYVKIEPNL 180 

PSYRITWTGK EDIIYNGITD WVYEEEVFSA YSALWWSPNG TFLAYAQFND TEVPLIEYSF 240 

YSDESLQYPK TVRVPYPKAG AVNPTVKFFV VNTDSLSSVT NATSIQITAP ASMLIGDHYL 300 

CDVTWATQER ISLQWLRRIQ NYSVMDICDY DESSGRWNCL VARQHIEMST TGWVGRFRPS 360 

EPHFTLDGNS FYKIISNEEG YRHICYFQID KKDCTFITKG TWEVIGIEAL TSDYLYYISN 420 

EYKGMPGGRN LYKIQLIDYT KVTCLSCELN PERCQYYSVS FSKEAKYYQL RCSGPGLPLY 480 

TLHSSVNDKG LRVLEDNSAL DKMLQNVQMP SKKLDFIILN ETKFWYQMIL PPHFDKSKKY 540 

PLLLDVYAGP CSQKADTVFR LNWATYLAST ENIIVASFDG RGSGYQGDKI MHAINRRLGT 600 

FEVEDQIEAA RQFSKMGFVD NKRIAIWGWS YGGYVTSMVL GSGSGVFKCG IAVAPVSRWE 660 

YYDSVYTERY MGLPTPEDNL DHYRNSTVMS RAENFKQVEY LLIHGTADDN VHFQQSAQIS 720 
KALVDVGVDF QAMWYTDEDH GIASSTAHQH IYTHMSHFIK QCFSLP 



SEQ ID NO:225 PBJ2 DNA SEQUENCE 

Nucleic Acid Accession #: none found 

Coding sequence: 1-261 (underlined sequences correspond to start and stop codons) 

1 11 21 31 41 51 



ATGGCTCTGG CGAAGGTGAG GGAGCCAAAC GCAAATGACA ATGCCATCAG AGTTGACAAC 60 

AGAAGTGTGA TTAAAGTGCG TGCTAACCAG TGTTCCCTGC ATGAGGCAGA AAGTGAATCC 120 

AGAAACCCTC AGGAGCTCTG GATGGGCCTG CTCCTCTTGA TGGGGGTCCT AGAAGCATGT 180 

GTGGAAATGA GGCCTCTGTC AGTCTGGTCC CTGAGAGATG ACAAGGAGCA GAGCCCCCAC 240 
CAGCCCACAC TGGATGTCTA A 



SEQ ID NO:226 PBJ2 Protein sequence: 

Protein Accession #: none found 

1 11 21 31 41 51 

| | | | | | 

MALAKVREPN ANDNAIRVDN RSVIKVRANQ CSLHEAESES RNPQELWMGL LLLMGVLEAC 60 
VEMRPLSVWS LRDDKEQSPH QPTLDV 

SEQ ID NO:227 PBM2 DNA SEQUENCE 

Nucleic Acid Accession #: none found 

Coding sequence: 1-462 (underlined sequences correspond to start and stop codons) 

1 11 21 31 41 51 

| | | | | | 

ATGCCAAATG CTGAGTTAGA AGCAAAGAGC CTTGGAAGCA GTAAATGTTT AAAAACTGCT 60 
CTCATACTTG CTGTATGTTG TGGATCAGCA AATATAGTCA GCCCTCTACT TGAGCAAAAT 120 
ATTGATGTAT CTTCTCAAGA TCTGGACAGA CGGCCAGAGA GTATGCTGTT TCTAGTCATC 180 
ATCATGTGGA CCAGTTTTGT GGAAGACAAT CTTTCCATGG GCTGGGGGAA GCTAGAAGAT 240 
TTTATGGCTA TTGAAGAAGA AATGAAGAAG CACGGAAGTA CTCATGTGGG ATTCCCAGAA 300 
AACCTGACTA ATGGTGCCGC TGCTGGCAAT GGTGATGATG GATTAATTCC TCCAAGGAAG 360 
AGCAGAACAC CTGAAAGCCA GCAATTTCCT GACACTGAGA ATGAAGAGTA TCACAGGTTT 420 
GTCAAAGATC AGATAGTTGT AGATATGCGG CGTTATTTCT GA 

SEQ ID NO:228 PBM2 Protein sequence: 

Protein Accession #: none found 

1 11 21 31 41 51 

| | | | | | 

MPNAELEAKS LGSSKCLKTA LILAVCCGSA NIVSPLLEQN IDVSSQDLDR RPESMLFLVI 60 
IMWTSFVEDN LSMGWGKLED FMAIEEEMKK HGSTHVGFPE NLTNGAAAGN GDDGLIPPRK 120 
SRTPESQQFP DTENEEYHRF VKDQIVVDMR RYF 

SEQ ID NO:229 PEZ2 DNA SEQUENCE 

Nucleic Acid Accession #: NM_014253 

Coding sequence: 65-8242 (underlined sequences correspond to start and stop codons) 



1 11 21 31 41 51 

| | | | | | 

GACTGCTTGC ATTAAAGGAC TTCCTCATCC TTTTTTTCAT GAAACTGAGC TTGCTTAATC 60 

AGAGATGGAG CAAACTGACT GCAAACCCTA CCAGCCTCTA CCAAAAGTCA AGCATGAAAT 120 

GGATCTAGCT TACACCAGTT CTTCTGATGA GAGTGAAGAT GGAAGAAAAC CAAGACAGTC 180 

ATACAACTCC AGGGAGACCC TGCACGAGTA TAACCAGGAG CTGAGGATGA ATTACAATAG 240 

CCAGAGTAGA AAGAGGAAAG AAGTAGAAAA ATCTACTCAA GAGATGGAAT TCTGTGAAAC 300 

CTCTCACACT CTGTGCTCTG GCTACCAAAC AGACATGCAC AGCGTTTCTC GGCATGGCTA 360 

CCAGCTAGAG ATGGGATCTG ATGTGGACAC AGAGACAGAA GGTGCTGCCT CACCTGACCA 420 

TGCACTAAGA ATGTGGATAA GGGGAATGAA ATCAGAGCAT AGTTCCTGTT TGTCCAGCCG 480 

GGCCAACTCT GCATTATCCT TGACTGACAC TGACCATGAA AGGAAGTCTC ATGGGGAAAA 540 

TGGTTTCAAA TTCTCTCCTG TTTGTTGTGA CATGGAGGCT CAAGCTGGGT CTACTCAAGA 600 

TGTGCAGAGC AGCCCACACA ACCAGTTCAC CTTCAGACCC CTCCCACCGC CACCTCCGCC 660 

TCCTCATGCC TGCACCTGTG CCAGGAAGCC ACCCCCTGCA GCGGACTCTC TTCAGAGGAG 720 

ATCAATGACT ACCCGCAGCC AGCCCAGCCC AGCTGCTCCA GCTCCCCCAA CCAGCACGCA 780 

GGATTCAGTC CATCTGCATA ACAGCTGGGT CCTGAACAGC AACATACCAT TGGAGACCAG 840 

GCATTCCCTG TTCAAACATG GATCTGGTTC CTCTGCGATC TTCAGTGCAG CCAGTCAGAA 900 

CTACCCTCTG ACATCCAATA CCGTGTACTC GCCCCCTCCC AGGCCTCTTC CTCGAAGCAC 960 

CTTTTCCCGA CCTGCCTTTA CCTTTAACAA ACCTTACAGG TGCTGCAACT GGAAGTGCAC 1020 

AGCATTGAGC GCCACTGCAA TCACAGTGAC TTTGGCCTTG TTACTAGCCT ATGTGATTGC 1080 

AGTGCATTTG TTCGGCCTGA CTTGGCAGTT GCAACCAGTT GAAGGAGAGC TGTATGCAAA 1140 

TGGAGTTAGC AAAGGGAACA GGGGGACCGA GTCCATGGAC ACTACTTACT CTCCAATTGG 1200 

AGGAAAAGTT TCTGATAAAT CAGAGAAAAA AGTGTTTCAG AAGGGACGGG CGATAGACAC 1260 

TGGAGAAGTT GACATTGGTG CACAGGTCAT GCAGACCATT CCACCTGGTT TATTCTGGCG 1320 

TTTCCAGATT ACTATCCACC ATCCAATATA TCTGAAGTTC AATATTTCTT TAGCCAAGGA 1380 

CTCTCTGCTG GGAATTTATG GCAGAAGAAA CATTCCACCT ACACATACTC AGTTTGATTT 1440 

TGTAAAACTA ATGGATGGCA AACAGCTGGT CAAGCAGGAC TCCAAGGGCT CTGATGATAC 1500 

ACAGCACTCC CCTCGGAACC TGATCTTAAC TTCGCTTCAG GAGACAGGTT TCATAGAGTA 1560 

TATGGATCAA GGACCTTGGT ATCTGGCGTT TTACAATGAT GGAAAAAAGA TGGAGCAAGT 1620 

ATTCGTGTTA ACTACAGCAA TTGAAATAAT GGATGACTGT TCAACCAATT GCAATGGAAA 1680 

TGGAGAGTGT ATCTCTGGCC ATTGTCATTG TTTCCCAGGA TTCCTTGGAC CTGACTGTGC 1740 

TAGAGATTCC TGCCCTGTGC TGTGTGGTGG GAATGGAGAA TACGAGAAAG GACACTGTGT 1800 

CTGCCGGCAT GGCTGGAAGG GGCCAGAGTG TGACGTTCCG GAAGAACAAT GCATTGATCC 1860 

AACATGCTTT GGCCACGGCA CCTGCATCAT GGGAGTCTGC ATCTGTGTGC CAGGATACAA 1920 






AGGAGAAATA TGCGAGGAAG AGGACTGCCT AGACCCAATG TGTTCCAACC ATGGCATCTG 1980 

TGTAAAAGGA GAATGTCACT GTTCTACTGG CTGGGGAGGA GTTAACTGTG AAACACCACT 2040 

TCCTGTATGT CAAGAGCAGT GCTCAGGACA CGGAACTTTT CTTCTGGACG CTGGAGTATG 2100 

CAGCTGTGAT CCCAAGTGGA CAGGATCTGA CTGCTCAACA GAGCTGTGTA CCATGGAGTG 2160 

TGGTAGCCAT GGAGTCTGCT CAAGAGGAAT TTGCCAGTGT GAAGAAGGCT GGGTAGGACC 2220 

AACATGTGAG GAACGCTCCT GTCATTCTCA TTGTACTGAG CATGGCCAAT GCAAAGATGG 2280 

AAAATGTGAG TGTAGCCCTG GATGGGAGGG CGACCACTGC ACAATTGCTC ACTACTTAGA 2340 

TGCTGTCCGA GATGGCTGCC CAGGGCTCTG CTTTGGAAAT GGACGATGTA CCCTGGATCA 2400 

AAATGGTTGG CACTGTGTGT GTCAGGTGGG TTGGAGTGGG ACAGGCTGCA ATGTTGTCAT 2460 

GGAAATGCTT TGTGGAGATA ACTTGGACAA TGATGGAGAT GGTTTAACCG ACTGTGTGGA 2520 

TCCTGACTGT TGTCAACAAA GCAACTGTTA TATAAGTCCT CTCTGCCAGG GCTCACCAGA 2580 

TCCTCTTGAC CTCATTCAGC AAAGCCAAAC TCTCTTCTCT CAGCACACTT CAAGACTTTT 2640 

TTATGATCGA ATCAAATTCC TCATTGGCAA GGACAGTACT CATGTCATTC CTCCTGAGGT 2700 

GTCATTTGAC AGCAGGCGTG CCTGTGTGAT TCGAGGCCAA GTGGTGGCCA TAGATGGAAC 2760 

TCCTCTAGTG GGAGTGAATG TCAGTTTCTT GCACCACAGT GATTATGGGT TTACCATCAG 2820 

CCGGCAAGAT GGAAGCTTTG ACCTCGTGGC CATCGGTGGC ATCTCTGTCA TCTTAATCTT 2880 

CGACCGATCC CCTTTCCTGC CTGAGAAGAG AACACTCTGG TTGCCTTGGA ATCAGTTTAT 2940 

TGTGGTAGAG AAAGTCACCA TGCAGAGAGT TGTATCAGAC CCGCCATCCT GCGATATCTC 3000 

CAACTTTATC AGCCCAAACC CTATTGTGCT TCCTTCACCG CTCACATCAT TTGGAGGGTC 3060 

CTGTCCAGAG AGGGGAACTA TTGTTCCTGA GCTGCAGGTT GTACAGGAGG AAATTCCCAT 3120 

TCCCTCCAGC TTTGTGAGGC TGAGTTACCT GAGCAGCCGC ACCCCTGGGT ATAAAACCCT 3180 

GCTACGGATC CTTCTGACAC ATTCAACGAT TCCCGTAGGC ATGATAAAAG TACACCTCAC 3240 

AGTAGCTGTG GAAGGGCGAC TCACACAGAA GTGGTTTCCC GCCGCAATTA ATCTTGTCTA 3300 

CACATTTGCT TGGAACAAGA CCGATATCTA TGGACAGAAG GTTTGGGGCC TGGCAGAGGC 3360 

TTTGGTATCT GTGGGATATG AATATGAAAC GTGCCCTGAC TTTATTCTCT GGGAGCAAAG 3420 

GACAGTCGTT TTACAAGGTT TTGAGATGGA TGCTTCTAAC CTAGGAGACT GGTCTTTGAA 3480 

TAAGCATCAC ATTTTGAATC CTCAAAGTGG AATCATACAT AAAGGGAATG GAGAAAATAT 3540 

GTTCATTTCC CAGCAGCCCC CAGTCATATC AACCATAATG GGTAATGGAC ACCAAAGGAG 3600 

TGTAGCCTGC ACCAACTGCA ATGGCCCAGC CCACAACAAC AAACTCTTTG CTCCTGTCGC 3660 

CTTAGCTTCT GGCCCTGATG GCAGTGTGTA TGTTGGCGAC TTCAATTTTG TAAGGAGAAT 3720 

ATTTCCCTCG GGAAACTCCG TTAGTATTTT GGAATTAAGC ACAAGTCCTG CTCACAAATA 3780 

CTATCTGGCT ATGGACCCTG TGTCTGAATC ACTCTATCTA TCAGACACCA ATACTCGCAA 3840 

AGTCTACAAG TTGAAATCTC TTGTGGAGAC GAAAGATCTG TCCAAGAATT TTGAAGTGGT 3900 

GGCAGGAACT GGTGATCAGT GCCTTCCCTT TGACCAGAGT CATTGTGGAG ATGGTGGGAG 3960 

AGCATCGGAA GCTTCACTGA ATAGCCCTCG AGGCATCACA GTTGATAGGC ATGGATTTAT 4020 

TTACTTTGTG GATGGGACTA TGATTCGCAA AATTGATGAG AATGCTGTGA TCACAACTGT 4080 

AATCGGCTCA AATGGTCTGA CTTCCACACA ACCACTGAGC TGTGACTCAG GAATGGACAT 4140 

CACTCAGGTG CGATTAGAGT GGCCAACAGA CCTTGCAGTA AATCCTATGG ACAATTCATT 4200 

GTATGTCTTG GATAACAACA TTGTGCTGCA AATTTCTGAG AACAGGCGTG TTCGGATCAT 4260 

CGCAGGACGC CCCATTCACT GCCAGGTGCC AGGCATCGAT CATTTCCTGG TCAGCAAGGT 4320 

AGCAATTCAC TCCACTCTAG AGTCAGCGAG GGCCATCAGT GTCTCCCACA GCGGGCTGCT 4380 

CTTCATAGCT GAAACAGACG AGAGGAAAGT AAACCGCATT CAGCAAGTAA CCACCAATGG 4440 

GGAGATCTAC ATCATCGCTG GTGCCCCCAC TGACTGTGAC TGCAAAATTG ATCCAAACTG 4500 

TGACTGTTTT TCAGGTGATG GTGGCTATGC CAAAGATGCA AAGATGAAAG CCCCTTCCTC 4560 

CTTAGCAGTG TCGCCTGATG GAACCCTCTA TGTGGCAGAC CTCGGAAATG TTCGAATTCG 4620 

TACCATCAGC AGGAACCAAG CCCACCTGAA TGACATGAAC ATTTATGAGA TTGCTTCACC 4680 

CGCTGATCAG GAACTGTACC AGTTCACTGT AAATGGAACC CACCTACACA CCCTGAACTT 4740 

GATAACAAGG GACTATGTTT ATAACTTCAC CTACAATTCT GAAGGTGACT TGGGCGCGAT 4800 

TACCAGCAGC AATGGCAATT CAGTGCACAT TCGCCGTGAT GCAGGCGGAA TGCCGCTATG 4860 

GCTTGTGGTG CCTGGCGGAC AAGTATACTG GCTGACTATA AGCAGCAATG GAGTCCTGAA 4920 

AAGAGTGTCA GCCCAAGGCT ATAATCCGGC CTTAATGACC TATCCAGGAA ACACAGGGCT 4980 

TCTGGCTACC AAAAGTAACG AAAATGGATG GACAACCGTT TATGAGTATG ACCCCGAGGG 5040 

ACACCTGACC AATGCAACGT TTCCCACTGG AGAGGTCAGC AGCTTCCACA GTGACCTGGA 5100 

GAAGCTGACA AAAGTGGAGC TAGATACTTC CAACCGTGAA AATGTCCTCA TGTCAACCAA 5160 

CTTGACGGCA ACTAGTACCA TATATATTTT AAAACAAGAA AATACTCAAA GTACCTATCG 5220 

GGTGAATCCA GATGGTTCCC TGCGTGTCAC TTTTGCCAGC GGGATGGAGA TCGGCCTCAG 5280 

CTCAGAGCCC CACATCCTGG CAGGGGCAGT CAACCCTACC CTGGGCAAAT GCAACATCTC 5340 

ATTGCCCGGA GAGCACAATG CAAACCTCAT CGAGTGGCGG CAGAGGAAGG AGCAAAACAA 5400 

AGGCAATGTT TCGGCTTTTG AAAGGAGGCT GAGGGCCCAC AACAGAAACC TACTCTCCAT 5460 

AGATTTTGAT CATATAACCC GCACAGGAAA GATCTATGAT GACCATCGAA AATTCACCCT 5520 

TCGAATTCTT TATGACCAGA CTGGGCGACC CATTCTGTGG TCTCCTGTAA GCAGATATAA 5580 

TGAAGTGAAC ATCACATATT CACCTTCGGG ATTGGTGACG TTTATTCAAA GAGGAACGTG 5640 

GAATGAAAAA ATGGAATATG ACCAGAGTGG GAAAATTATT TCAAGAACTT GGGCTGATGG 5700 

GAAAATTTGG AGCTATACCT ACTTAGAAAA ATCTGTGATG CTTCTCCTAC ACAGCCAGCG 5760 

GCGTTACATC TTTGAGTATG ACCAATCAGA TTGCCTGCTG TCAGTTACCA TGCCTAGCAT 5820 

GGTGCGCCAC AGCTTACAAA CCATGCTTTC AGTGGGCTAC TACCGTAATA TCTACACCCC 5880 

ACCGGACAGT AGCACTTCTT TTATCCAAGA CTATAGTCGA GATGGCCGAT TGCTACAGAC 5940 

CCTGCATCTG GGGACAGGGC GCAGAGTCTT ATACAAGTAC ACCAAGCAAG CAAGGCTTTC 6000 

TGAGGTTCTC TATGATACCA CTCAGGTCAC ATTAACATAT GAAGAGTCTT CTGGAGTGAT 6060 

TAAGACAATA CACCTGATGC ATGACGGATT CATCTGCACA ATCAGATACA GGCAAACAGG 6120 

ACCTCTTATT GGACGCCAGA TTTTCAGATT CAGTGAAGAA GGCCTTGTGA ATGCACGGTT 6180 

CGACTACAGC TACAACAATT TCCGAGTCAC AAGCATGCAA GCTGTAATCA ATGAAACCCC 6240 

TTTGCCTATA GATCTTTACC GATATGTTGA TGTCTCTGGC AGAACAGAGC AGTTTGGAAA 6300 

ATTCAGTGTA ATTAATTACG ATTTAAATCA GGTCATAACT ACTACAGTGA TGAAACACAC 6360 

CAAAATCTTC AGTGCCAATG GACAAGTCAT TGAAGTCCAA TATGAAATCC TAAAGGCAAT 6420 

TGCCTACTGG ATGACCATTC AATATGATAA TGTGGGCCGA CATGGTAATA TGTGCATAAG 6480 

GGTAGGAGTA GATGCCAATA TAACAAGGTA CTTCTATGAA TACGATGCTG ATGGGCAACT 6540 

TCAGACTGTT TCTGTAAATG ACAAAACCCA GTGGCGTTAT AGTTACGATC TGAATGGAGA 6600 

CATCAACCTC TTAAGCCATG GGAAGAGTGC TCGTCTTACT CCTCTCCGAT ATGACCTCCG 6660 

AGACCGCATC ACCAGATTAG GAGAAATTCA GTATAAAATG GATGAAGATG GCTTTCTGAG 6720
GCAGAGGGGA AATGATATTT TTGAATATAA TTCTAATGGC CTGCTGCAGA AAGCCTACAA 6780
TAAGGCTTCT GGCTGGACTG TGCAGTATTA CTATGATGGG CTTGGGCGAC GTGTCGCGAG 6840
TAAGTCCAGC CTAGGGCAGC ACCTTCAGTT CTTTGTCGAC GCGACCGCGA ACCCCATAAG 6900
AGTTACTCAT TTGTACAACC ACACAAGCTC GGAGATTACA TCTCTGTATT ATGATCTCCA 6960
AGGTCACCTT ATTGCCATGG AGTTAAGCAG TGGTGAAGAA TATTATGTAG CCTGTGATAA 7020
TACAGGTACC CCACTAGCTG TGTTCAGCAG CCGAGGTCAG GTCATAAAGG AGATACTATA 7080
CACACCTTAT GGCGATATCT ATCATGACAC TTACCCTGAC TTTCAGGTCA TAATTGGTTT 7140
TCATGGAGGA CTCTATGATT TCCTTACTAA ATTAGTGCAC CTGGGGCAAA GGGATTATGA 7200
TGTTGTTGCT GGCAGATGGA CAACGGCCTA TCATCACATA TGGAAACAGT TGAACCTCCT 7260
TCCTAAACCA TTCAACCTCT ACTCCTTTGA AAATAACTAC CCAGTTGGCA AAATTCAAGA 7320
TGTTGCAAAG TATACCACAG ACATCAGAAG TTGGTTGGAG CTATTTGGTT TCCAATTACA 7380
CAATGTACTA CCTGGATTTC CCAAACCTGA ATTAGAAAAT TTAGAATTAA CTTACGAGCT 7440
TCTACGGCTT CAGACAAAAA CTCAAGAGTG GGATCCTGGA AAGACTATCC TGGGCATTCA 7500
GTGTGAACTC CAGAAACAGC TCAGGAATTT CATTTCCTTG GACCAACTAC CTATGACTCC 7560
CCGATACAAT GATGGACGGT GCCTTGAAGG AGGGAAGCAA CCAAGGTTTG CTGCTGTCCC 7620
TTCTGTTTTT GGGAAAGGTA TAAAATTTGC CATCAAGGAT GGCATAGTAA CAGCTGATAT 7680
TATAGGAGTA GCCAATGAAG ATAGCAGGCG GCTTGCTGCC ATTCTCAATA ATGCCCATTA 7740
CCTGGAAAAC CTACATTTTA CCATAGAGGG GAGGGACACT CACTACTTCA TTAAGCTTGG 7800
GTCTCTGGAG GAAGACCTGG TGCTCATCGG TAACACTGGG GGGAGGCGGA TTCTGGAGAA 7860
TGGTGTCAAT GTCACTGTGT CCCAGATGAC TTCTCTGTTG AATGGGAGGA CTAGACGGTT 7920
TGCAGATATT CAGCTCCAGC ATGGAGCCCT GTGCTTCAAC ATCCGGTATG GGACAACTGT 7980
CGAAGAGGAA AAGAATCACG TGTTGGAGAT TGCCAGACAG CGCGCAGTGG CCCAGGCCTG 8040
GACTAAGGAA CAAAGAAGGC TGCAAGAGGG GGAAGAGGGG ATTAGGGCAT GGACAGAAGG 8100
GGAAAAGCAG CAGCTTTTGA GCACTGGGCG GGTACAAGGT TACGATGGGT ATTTTGTTTT 8160
GTCTGTTGAG CAGTATTTAG AACTTTCTGA CAGTGCCAAT AATATTCACT TTATGAGACA 8220
GAGCGAAATA GGCAGGAGGT AACAAAAATA TCTCTGCCTT TGCGTCACCA AAGACTGCCT 8280
GTTTTTAAAA CATAAAATGG TTTATTGTAT TGGTTTTCTA GATCAGAACT CTGTATATGT 8340
AAATATGGAG GAAAAACATA TCCAACTGCC TTTCAATGTG ACGGAAGATG GTATTTTAAT 8400
ATTGTTTGTT TAAACTCTTT AAGAAATGAC AGAGATTTTT AGTTCTTGTG TGGCAGTATT 8460
CAAAATAACA CAAGTAGAAC TCAAACAGCT AAAAACAGTT TTCAGAAAGC ACCACTTTCA 8520
ATTTGCCGAG CCATGCATAT GTTCCAATAT CCAGAAAGAA CCCAAGGTTC TCTATCTCTA 8580
TTGTGAGAAG CAGTTTCATC CTTAACTGTT GGCAGAACTT ACGGGCTATT TGAATAGGTG 8640
GTGCAATAGT ATCTGAAACT TGCCTTTCGA AAGACTGCCA GCCCTTTGAC GTTTTCCAGA 8700
TCTGTTATAG GAAACTTAAA AACAGGTGTA AAATGTCTTC AGCCACCATC TCCTAGAGTG 8760
AGGACCCAAT TGCCCTTCCT TCTTGATTAT TCCTCCTTGC TTGTTAAAGT AAATGCCATA 8820
TTGTTGTGCT GTGTTTTGGC GTGTGGTGGC TGGGTTCTGT CTACCATGCT TCCCTGTGGG 8880
TGTGGTAACC AGACTGTATA GCCGCTATTT GCTCGTGTGT ACATGATACC AAAGCAGCTG 8940
GCCAGCGTGA CCTCTCTCAC ACGACCTGTT TTGACTCAAT TTTTTACTAA AAGTTGTTCA 9000
GCTGTATTGG TATCATGTAA ACATAGCTTT TATTAACCTG GGTAGGAATT TCTCATTTAT 9060
ATATAGGATG TGTTTTGGTC ATAGTTTCAC ATTAGTGATT CAGTATCTAT ACACTGACCC 9120
AATGGTTTTG TGCACATGAA CGGTAATTTA CTTAAAAGTA TGATTCTGGT ACAAAAACAA 9180
ACAAAGGCTT TAGCAGGCAT ACGTGTCTGG GATGCCGATA CATACATTAA CTACTACTGC 9240
AGAAATTCAT AAGAGCCAAA ACCTTAAAAA AATAGACCTG GTACTTAAGT GAAAGTACTA 9300
AAGGGAAGAC CAGACCAAAC ATCACAGCAG TTGCTGCCAC ATTGTTTCAG CCCACTTAGA 9360
TTTATCTTTC AAATGTACAA TTCTGTATTG AACATCTCCC AGCCATCTTC AGGAAATCGA 9420
ATCAAGTAAA TCCTTTCCAA CCGAAAACAT TTCAACTAAC TATAGAGAGG CAGACTCATT 9480
TTTACTAAAA TAATTTATAC AGTTAGTTAT TTTCGTTCTC CGTACTTACC CATTTATCTT 9540
TATTTAATCG TCTCTACTGC CTAGGAAAAT AACTATTTTC CAGGACGGGT TATTTGTTCT 9600
GCGATCATTT AAAATTTGGA GAAAGGTCAG GATTAGTGTT AATATCAGCT GCAGTTTCTC 9660
AATCTCTAGG AATCCTGCAG TAAAACAAGC CCCTTGGTGA GCTGGAAGAT TTGTGCCCAG 9720
TGACAAAGAG ATAGTTTGTA AAATGCTGTG TAATTGTAAG TTACCACAAA TGAAAATACA 9780
TGACAGCACA ATGTGGCCCG TAGAAAATTC CCCTGAGCCA GCTTCTGCAC TTTCATCACC 9840
GAATCTGAAC ATTTGCTATG TCTGAAGGCA AATTTATGAT GGAATGTTAG TTTGGATTCT 9900
TTCCAGATGC TACCTAAATG CAGTGTGGGG TCATTGCCTT GCTTTGCGAT GACAGTTTCT 9960
TTGAAAATAT GCAAAGTCAT AAGCTCATGT TAAGGTTTTT CAAGAGTCTG CCTCCTACTA 10020
CACAAAGGAA AGCAAGGGAA AGGAAATGAC CCTGGCAAAC AGTAGGGAAG GGTGTATTCA 10080
AACATTTCAT TTTCAAAACC TTCGGGTTAG AATACCACTT ACACATGTAT TCTGAGAGAC 10140
AGAATTCATG AGGAACTCAT CTCTCTTTAT AACTGGAAAC ACACCAGCTT GATATATTGC 10200
TAATCCATAC TAAAATCATA TTATTGGGTT TTTTCTGAAT CAGGCCTGTA TTAATGGTAC 10260
AGTATTTATT CAGAATGGAA TTCTAAAATT ACTAACAAAC TTGTTGAAAA TTTGAATACC 10320
TCCACACCAA CCTAAAAATG GACCTTAAGT TCCTAGAACC TCTGATGTTC TTTTAAATTA 10380
ATGGAAAAAT AATTTGTGAA CTGTATATAG AGAGTGCATT CATAAATGTG ATTATGTATT 10440
TTATCACAAA TCCAAAATGT CAATATTAGA GTCTATTTTG CTTATATTTT AAGCAATTAT 10500
ACGTTTTTGC AATTCATTGA TGATGTATCA TTTTCAAACT GCTTTAAATA TCCATTAGAA 10560
ACAAATATTT GAAGCTTTTA CTTAATAGTG ATTACCTTGA ACTGTGCATT TCTAGTTTGT 10620
AATACGTATT TGGTTGGTTC GTGCCTTTAG TTTGTTAAAG TTACATTTGT ATTATATTCA 10680
GGAAATGCAC TTTTTATTAC TTACAGCTGT GGTTTTAATA CTGCCTTGAA CTATTATTAT 10740
TCTTTTTACA ACTCCTAAAG CTTGAGGQAG GAAAGAAAAA AAAAACAAAA CTACTAATCA 10800
GTAGTAAATC GAAGAGAAAC ATTTTGGCAT TTCTTAAGAA GAAGATGGAG ATATTGAGTA 10860
TATCACTTCC TATTCAGCTG AATAGAAAGA ATGCCTTCAT TGACTTGCAG TTCTGCAGTT 10920
TAAATTATTG AAAGAACAAT TCGTTTGCAT TTCCTGATGA AAGTAAAAGC ATTTTTCAGA 10980
GAAACATATG AATTTCTCAT ACCCAGCAGA CAGATGGCTG ACACTGCACA GCCACACACC 11040
ATTCGAGTAA GTTAAAGTGA GAGCATAGTA GTTGGACTCT CCTATGAAGA ACATTCTGGG 11100
CTGGAGGCAG GGAATACTCC ATGGTTGTTT CTTTTTCCTA CTTAAGCCCA TTTTGTTTGT 11160
GCTTTTCTGT TTTGTTTTGT TTTCACTCTT GCACTACAGT CTAGAGATCC AAATGAACTG 11220
AAAAGTTCAA AGTTTAACAC ATTTAAATAT GTTTACTTTT AGTTGTCATT CTAATCGTTA 11280
TTGATTAGAA GCATGACTCC TGAAGGAAAG GGAAATAAAT CTCAATTCAT ACTAACTTGC 11340
AACAAAACAC TTTTACCATA TAAATAAGTA TATGATTTAT TTTTAACCCA AAAAATGTAT 11400
AAAATAAGTG TGTCCTTTAC TGTCAATTTA TCGAGAAGAT CTATAATATA TAGACTACAT 11460
ATATATAATA TATACAACAT AGCCAAATGT ATGAAAACTT GACAATGTAT AATTTGGAAT 11520
TCACATGCTA CCTATGTAGA CAGGTATGAA ATTAAGTTAT AATTTTCATG AGACATTTTC 11580
ATCACTGTTG ACACAGTTTC AAGGCATTCC ATCATGTTAT TTTGACTCTT TTTCTTTTTT 11640
TTTTCTTTAA AAATATATTT TTAACTAGAC CAGGCCCCAC TATAATATCA CTTAAGAGAG 11700
TCAGGGCAAA GTTTTTGCAT TTATGAAGAT GTGTTCATGT AAGGGTGATT GTAATGGAGT 11760
TCATTGGTAA TAGAAGCAAA AGTACAGTAA CGAAGTATTG AAAAGAAAAT TTTGGAGACA 11820 
TTGGAGCATA TTATATATAG CTTGTGGAAA GACATAAGGC TACAGATGGA ATGGAACATT 11880 
CCTGTTTTCT TGAAGAAATT CACATACACA TAGCTGACCT GACTAGTACT TCAGCTCTTC 11940 
CACAGCCTTC TATAAAGGTT CTTTCTTCTG CAAAGAAAAC AAAACAAAAC AAAACAAAAC 12000 
AAAAAAAAAC AAAAAAAGCG CAAAAAACAA AAAAACAAAA AAAAGCAAAG TAAAATTTAA 12060 
AAATACAGAA AACAAACAAC AAAAAAGAAT TCAACCATAA ATAGTGACTA TTATTTTCAG 12120
TGTGTCCTTC ATGTGAAAGC TATTAAGGAC CAAATATACT ACTGTTCATA AGAAGAAATT 12180 
ACTTTCTAAA CAGTAACTGA AAATACTTAG AGTTAAACTT GCTGTGGATT TTGTCTTGGC 12240 
AGTTGTCATC TTACATTATT TGTCAAAGGA AATGTGTTTG GCAGTTAAAA ATCTTTCCTT 12300 
AGATTTAGTG GTGGACTTTA ACCTCTTAAA TAAATGTTAG TATATCAGAT TGTGTCCTTG 12360 
AAAAATATTT TACTTGTATG AATCATGACA ACGTCTAAAT CTTTACTATT CTTCTGGCAA 12420 
AAGCATCAGT AAGAAAGAAG GCGAAAAAGA GAAGTATAGC CTTTATGTCA GAAAAACATT 12480 
CTTTTTAGCT GCTTACTTTC TCATGAAAAG TAAAGATGTT TACAGTGTAT GCCAAGTTTT 12540 
CAGTTTCTGT ATAACAACAG GTAGAGGTTC TAATCATATT GAAAATTGTG TTATAATGGT 12600 
CTGAGCCATG TTGCTAGGAA ACAATAGGTT CCAATTTTGT ATTCCTGCTC TCCTGTGCTG 12660
AAAAGTGACT GGATACTGTA CAGGTTCATG TTCTCTGGCT GCAGTTAAAT GGTCTTTTGC 12720 
ATTTTGCTCT GGCTTTCAGG CCAGAAGCAT GCATTTTTCT ACAAGAGCAT CACAACAACA 12780 
TGCTGTAAAT ATTTAAAGTT AAACATTATG TGTTGATATT TGAAAGAAAA GTACTTTGAA 12840 
TATTTCATTT TTAAAAAATA AAATTGCCAA TGAAAAAAAA 



SEQ ID NO:2? PF?? Protein sequence:
Protein Accession #: NP_055068

1 11 21 31 41 51

MEQTDCKPYQ PLPKVKHEMD LAYTSSSDES EDGRKPRQSY NSRETLHEYN QELRMNYNSQ 60 

SRKRKEVEKS TQEMEFCETS HTLCSGYQTD MHSVSRHGYQ LEMGSDVDTE TEGAASPDHA 120 

LRMWIRGMKS EHSSCLSSRA NSALSLTDTD HERKSDGENG FKFSPVCCDM EAQAGSTQDV 180 

QSSPHNQFTF RPLPPPPPPP HACTCARKPP PAADSLQRRS MTTRSQPSPA APAPPTSTQD 240 

SVHLHNSWVL NSNIPLETRH SLFKHGSGSS AIFSAASQNY PLTSNTVYSP PPRPLPRSTF 300 

SRPAFTFNKP YRCCNWKCTA LSATAITVTL ALLLAYVIAV HLFGLTWQLQ PVEGELYANG 360 

VSKGNRGTES MDTTYSPIGG KVSDKSEKKV FQKGRAIDTG EVDIGAQVMQ TIPPGLFWRF 420 

QITTHHPIYLi KFNISLAKDS LLGIYGRRNI PPTHTQFDFV KLMDGKQLVK QDSKGSDDTQ 480 

HSPRNLILTS LQETGFIEYM DQGPWYLAFY NDGKKMEQVF VLTTAIEIMD DCSTNCNGNG 540 

ECISGHCHCF PGFLGPDCAR DSCPVLCGGN GEYEKGHCVC RHGWKGPECD VPEEQCIDPT 600

CFGHGTCIMG VCICVPGYKG EICEEEDCLD PMCSNHGICV KGECHCSTGW GGVNCETPLP 660 

VCQEQCSGHG TFLLDAGVCS CDPKWTGSDC STELCTMECG SHGVCSRGIC QCEEGWVGPT 720 

CEERSCHSHC TEHGQCKDGK CECSPGWEGD HCTIAHYLDA VRDGCPGLCF GNGRCTLDQN 780 

GWHCVCQVGW SGTGCNWME MLCGDNLDND GDGLTDCVDP DCCQQSNCYI SPLCQGSPDP 840 

LDLIQQSQTL FSQHTSRLFY DRIKFLIGKD STHVIPPEVS FDSRRACVIR GQWAIDGTP 900 

LVGVNVSFLH HSDYGFTISR QDGSFDLVAI GGISVILIFD RSPFLPEKRT LWLPWNQFIV 960 

VEKVTMQRW SDPPSCDISN FISPNPIVLP SPLTSFGGSC PERGTIVPEL QWQEEIPIP 1020 

SSFVRLSYLS SRTPGYKTLL RILLTHSTIP VGMIKVHLTV AVEGRLTQKW FPAAINLVYT 1080 

FAWNKTDIYG QKVWGLAEAL VSVGYEYETC PDFILWEQRT WLQGFEMDA SNLGDWSLNK 1140 

HHILNPQSGI IHKGNGENMF ISQQPPVIST IMGNGHQRSV ACTNCNG PAH NNKLFAPVAL 1200 

ASGPDGSVYV GDFNFVRRIF PSGNSVSILE LSTSPAHKYY LAMDPVSESL YLSDTNTRKV 1260 

YKLKSLVETK DLSKNFEWA GTGDQCLPFD QSHCGDGGRA SEASLNSPRG ITVDRHGFIY 1320 

FVDGTMIRKI DENAVITTVI GSNGLTSTQP LSCDSGMDIT QVRLEWPTDL AVNPMDNSLY 1380 

VLDNNIVLQI SENRRVRIIA GRPIHCQVPG IDHFLVSKVA IHSTLESARA ISVSHSGLLF 1440 

IAETDERKVN RIQQVTTNGE IYIIAGAPTD CDCKIDPNCD CFSGDGGYAK DAKMKAPSSL 1500 

AVSPDGTLYV ADLGNVRIRT ISRNQAHLND MNIYEIASPA DQELYQFTVN GTHLHTLNLI 1560 

TRDYVYNFTY NSEGDLGAIT SSNGNSVHIR RDAGGMPLWL WPGGQVYWL TISSNGVLKR 1620 

VSAQGYNPAL MTYPGNTGLL ATKSNENGWT TVYEYDPEGH LTNATFPTGE VSSFHSDLEK 1680 

LTKVELDTSN RENVLMSTNL TATSTIYILK QENTQSTYRV NPDGSLRVTF ASGKEIGLSS 1740 

EPHILAGAVN PTLGKCNISL PGEHNANLIE WRQRKEQNKG NVSAFERRLR AHNRNLLSID 1800 

FDHITRTGKI YDDHRKFTLR ILYDQTGRPI LWSPVSRYNE VNITYSPSGL VTFIQRGTWN 1860

EKMEYDQSGK IISRTWADGK IWSYTYLEKS VMLLLHSQRR YIFEYDQSDC LLSVTMPSMV 1920 

RHSLQTMLSV GYYRNIYTPP DSSTSFIQDY SRDGRLLQTL HLGTGRRVLY KYTKQARLSE 1980 

VLYDTTQVTL TYEESSGVIK TIHLMHDGFI CTIRYRQTGP LIGRQIFRFS EEGLVNARFD 2040 

YSYNNFRVTS MQAVINETPL PIDLYRYVDV SGRTEQFGKF SVINYDLNQV ITTTVMKHTK 2100 

IFSANGQVIE VQYEILKAIA YWMTIQYDNV GRHGNMCIRV GVDANITRYF YEYDADGQLQ 2160

TVSVNDKTQW RYSYDLNGDI NLLSHGKSAR LTPLRYDLRD RITRLGEIQY KMDEDGFLRQ 2220

RGNDIFEYNS NGLLQKAYNK ASGWTVQYYY DGLGRRVASK SSLGQHLQFF VDATANPIRV 2280

THLYNHTSSE ITSLYYDLQG HLIAMELSSG EEYYVACDNT GTPLAVFSSR GQVIKEILYT 2340

PYGDIYHDTY PDFQVIIGFH GGLYDFLTKL VHLGQRDYDV VAGRWTTAYH HIWKQLNLLP 2400

KPFNLYSFEN NYPVGKIQDV AKYTTDIRSW LELFGFQLHN VLPGFPKPEL ENLELTYELL 2460

RLQTKTQEWD PGKTILGIQC ELQKQLRNFI SLDQLPMTPR YNDGRCLEGG KQPRFAAVPS 2520

VFGKGIKFAI KDGIVTADII GVANEDSRRL AAILNNAHYL ENLHFTIEGR DTHYFIKLGS 2580

LEEDLVLIGN TGGRRILENG VNVTVSQMTS LLNGRTRRFA DIQLQHGALC FNIRYGTTVE 2640

EEKNHVLEIA RQRAVAQAWT KEQRRLQEGE EGIRAWTEGE KQQLLSTGRV QGYDGYFVLS 2700 
VEQYLELSDS ANNIHFMRQS EIGRR 

SEQ ID NO:231 PFD4 DNA SEQUENCE:

Nucleic Acid Accession #: NM_000441

Coding sequence: 225-2567 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51

| | | | | |

CTCAGCCTTC CCGGTTCGGG AAAGGGGAAG AATGCAGGAG GGGTAGGATT TCTTTCCTGA 60 

TAGGATCGGT TGGGAAAGAC CGCAGCCTGT GTGTGTCTTT CCCTTCGACC AAGGTGTCTG 120 

TTGCTCCGTA AATAAAACGT CCCACTGCCT TCTGAGAGCG CTATAAAGGC AGCGGAAGGG 180 

TAGTCCGCGG GGCATTCCGG GCGGGGCGCG AGCAGAGACA GGTCATGGCA GCGCCAGGCG 240 

GCAGGTCGGA GCCGCCGCAG CTCCCCGAGT ACAGCTGCAG CTACATGGTG TCGCGGCCGG 300 

TCTACAGCGA GCTCGCTTTC CAGCAACAGC ACGAGCGGCG CCTGCAGGAG CGCAAGACGC 360 

TGCGGGAGAG CCTGGCCAAG TGCTGCAGTT GTTCAAGAAA GAGAGCCTTT GGTGTGCTAA 420

AGACTCTTGT GCCCATCTTG GAGTGGCTCC CCAAATACCG AGTCAAGGAA TGGCTGCTTA 480 

GTGACGTCAT TTCGGGAGTT AGTACTGGGC TAGTGGCCAC GCTGCAAGGG ATGGCATATG 540 

CCCTACTAGC TGCAGTTCCT GTCGGATATG GTCTCTACTC TGCTTTTTTC CCTATCCTGA 600 

CATACTTTAT CTTTGGAACA TCAAGACATA TCTCAGTTGG ACCTTTTCCA GTGGTGAGTT 660 

TAATGGTGGG ATCTGTTGTT CTGAGCATGG CCCCCGACGA ACACTTTCTC GTATCCAGCA 720 

GCAATGGAAC TGTATTAAAT ACTACTATGA TAGACACTGC AGCTAGAGAT ACAGCTAGAG 780 

TCCTGATTGC CAGTGCCCTG ACTCTGCTGG TTGGAATTAT ACAGTTGATA TTTGGTGGCT 840 

TGCAGATTGG ATTCATAGTG AGGTACTTGG CAGATCCTTT GGTTGGTGGC TTCACAACAG 900 

CTGCTGCCTT CCAAGTGCTG GTCTCACAGC TAAAGATTGT CCTCAATGTT TCAACCAAAA 960 

ACTACAATGG AGTTCTCTCT ATTATCTATA CGCTGGTTGA GATTTTTCAA AATATTGGTG 1020 

ATACCAATCT TGCTGATTTC ACTGCTGGAT TGCTCACCAT TGTCGTCTGT ATGGCAGTTA 1080 

AGGAATTAAA TGATCGGTTT AGACACAAAA TCCCAGTCCC TATTCCTATA GAAGTAATTG 1140 

TGACGATAAT TGCTACTGCC ATTTCATATG GAGCCAACCT GGAAAAAAAT TACAATGCTG 1200 

GCATTGTTAA ATCCATCCCA AGGGGGTTTT TGCCTCCTGA ACTTCCACCT GTGAGCTTGT 1260 

TCTCGGAGAT GCTGGCTGCA TCATTTTCCA TCGCTGTGGT GGCTTATGCT ATTGCAGTGT 1320 

CAGTAGGAAA AGTATATGCC ACCAAGTATG ATTACACCAT CGATGGGAAC CAGGAATTCA 1380 

TTGCCTTTGG GATCAGCAAC ATCTTCTCAG GATTCTTCTC TTGTTTTGTG GCCACCACTG 1440

CTCTTTCCCG CACGGCCGTC CAGGAGAGCA CTGGAGGAAA GACACAGGTT GCTGGCATCA 1500 

TCTCTGCTGC GATTGTGATG ATCGCCATTC TTGCCCTGGG GAAGCTTCTG GAACCCTTGC 1560 

AGAAGTCGGT CTTGGCAGCT GTTGTAATTG CCAACCTGAA AGGGATGTTT ATGCAGCTGT 1620 

GTGACATTCC TCGTCTGTGG AGACAGAATA AGATTGATGC TGTTATCTGG GTGTTTACGT 1680 

GTATAGTGTC CATCATTCTG GGGCTGGATC TCGGTTTACT AGCTGGCCTT ATATTTGGAC 1740 

TGTTGACTGT GGTCCTGAGA GTTCAGTTTC CTTCTTGGAA TGGCCTTGGA AGCATCCCTA 1800 

GCACAGATAT CTACAAAAGT ACCAAGAATT ACAAAAACAT TGAAGAACCT CAAGGAGTGA 1860 

AGATTCTTAG ATTTTCCAGT CCTATTTTCT ATGGCAATGT CGATGGTTTT AAAAAATGTA 1920 

TCAAGTCCAC AGTTGGATTT GATGCCATTA GAGTATATAA TAAGAGGCTG AAAGCGCTGA 1980 

GGAAAATACA GAAACTAATA AAAAGTGGAC AATTAAGAGC AACAAAGAAT GGCATCATAA 2040 

GTGATGCTGT TTCAACAAAT AATGCTTTTG AGCCTGATGA GGATATTGAA GATCTGGAGG 2100 

AACTTGATAT CCCAACCAAG GAAATAGAGA TTCAAGTGGA TTGGAACTCT GAGCTTCCAG 2160

TCAAAGTGAA CGTTCCCAAA GTGCCAATCC ATAGCCTTGT GCTTGACTGT GGAGCTATAT 2220 

CTTTCCTGGA CGTTGTTGGA GTGAGATCAC TGCGGGTGAT TGTCAAAGAA TTCCAAAGAA 2280 

TTGATGTGAA TGTGTATTTT GCATCACTTC AAGATTATGT GATAGAAAAG CTGGAGCAAT 2340 

GCGGGTTCTT TGACGACAAC ATTAGAAAGG ACACATTCTT TTTGACGGTC CATGATGCTA 2400 

TACTCTATCT ACAGAACCAA GTGAAATCTC AAGAGGGTCA AGGTTCCATT TTAGAAACGA 2460 

TCACTCTCAT TCAGGATTGT AAAGATACCC TTGAATTAAT AGAAACAGAG CTGACGGAAG 2520 

AAGAACTTGA TGTCCAGGAT GAGGCTATGC GTACACTTGC ATCCTGAAAG TGGGTTCGGG 2580 

AGGTCTCTAT GAGCAAGGAA TACAAGACAA AACTTCCTCA ATGCATTGAC TATTTCTTCA 2640 

GACTCAAAAC ACTCATTCTT TTTTCTATTA AGCCATTGAA AGAGAAGCAC TAAGACTGCT 2700 

TCTAGGCTTT ATTTATAAAA TAAACACCTT ATCCCTAACA TGGGCAAAAT GGCTAGAATT 2760 

ATTCAGACGA TTTGGCAGCG TCCAGGGTAA GCTGGTGTTA TAATACGCTG CTGATCTACA 2820 

TCACAGATTT GCTAATAATG TTCACGTGGG CCCTGGCATA TCTCTGTTCA GTTAGAGTGA 2880 

GTGCTGACCC AACAGCCTCT GTGGTCAAGC GAGTCACGAA TGATTAATCA TAAAGAAAAA 2940 

TCAGTTTTTG ACTGACCTGG ATATCCATGA GCTGCACTGA TCACCATGTA AGGTCACATT 3000 

TAGTAAATGC TGAAATAAAA TGATTAATGC ATTTATCAAT AAAAGCCTTT GAAAATACTT 3060 

TGGATAATAA ATTGGAGTTT TAAAAATGCA AATTTGCTTA GTATCTAATA ATGAAGTGTT 3120 

ATTACATATA GCCGGAATTG AGGATCTCTT TGATCCTGGA AATGGTTTAC CTAAAAGCTA 3180 

CAGAACCAGG CCAATATATT TTGAAATATT GATGCAGACA AATGAAATAA TAAAGAGATT 3240 

TTCATGGTTT ATAAAAATCT TTTTTGATAT GATAATAATC ATGATCACAA CTGAGATCAA 3300 

AAAAATATAT GACAGATTAT TTTGTTTAAA AATGCAGTTT TAATTATCTT AGTCTATAGA 3360 

AATGATCATT GCATGGAGGC ATGTATAGGT ATGATCTGTG TAAAATCTGA CATAAAAACA 3420

GTGCTATTCT GAGTGAAAAT TTTTTTGATG TGCTTACATA ACCATGGTGA TTAAAATGAG 3480 

TTTATATTTT TTCTCAAAAA TTTTAGCAGT GTGTAAAGTA AGTAATCTTT AACTGAACTC 3540 

TGACCACTTA AAAAAAAATC TAAAAATTGA ACTACCTATA GTAGTCTGTG TTTAAAGTGA 3600 

ATTTTTAAAG ACAAAGCATT CTAAATGAAC TCAATATAAA AACATTCATT TGGAATGTAC 3660 

ATACTGAAAA ATACAGGTTT TTTTGACCAA AAGTTTTTAT ATCTTTTCTT TTTATTTATT 3720 

TTTTTCCTAA GTGCCAACAA TTTTCTAGAT ATTATATACA ACACAGGCTT TGATCTTGGG 3780 

GACTTTTCCC ATATATTTCA CACTGGAGTG AATGAAGTTG TACTTCATTT CTAGAGAAAA 3840 

GTTATACCCA GGTCCCCAAT TGAGAATGTC TTGCTTGATT GAAAACGACA TCATCCCTTG 3900 

GTATACTCCA GGGATTGGTT TCAGGACCCC TGCATTTACC AAAATTTGTG CACACTCAAG 3960 

TCCTGCAGTC ACCCCTGCCT AAAGATAGAA TGGCTTCTCT GTTTTTCTTC TGAAATACAA 4020 

CCAGAAACAA TGTGTCTATT TCTGAAAGAA TAGGATTAAT GATCATACAA ATGGGTTAAT 4080

CCTGAATTCT GGTTGTAAAT CTGGTTACAG CATAACTAGG ATTATAATGC TGCCTCATTT 4140

TCACAGCACT ACTTGCTTAT ATTGACAACA AATCATCTCG CTAAAGAGTG AATGTAGGCC 4200

AGGCGCGGTG GCTCATGCCT GTAATCCCAG CACTTTGGGA GGCCGAGGCG GGTGGATCAC 4260 

GAGGTCAGGA GATCGAGACC ATCCTGGCTA ACATGGTAAA ACCCCGTCTC TACTAAAAAT 4320 

AGAAAAAAAG AAATTAGCCT AGCGTGGTGG CTGGCGGGCG CCTGTAGTCC CAGCTATTTG 4380 

GGAGGCTAAG GCAGGAGAAT GGCGTGAACC CGGGAGGCGG AGCTTGCAGT GAGCCGAGGT 4440 

CGTGCCACTG CACTCCAGCC TGGGCGACAG AGCAAGACTC CGTCTCAAAA AAAAAAAAAA 4500 



AAAAAAAAAA AGAGTGAATG TAATAGTCTT GCAGAAAATG AATGAATACC TTTGTTCAAT 4560
AAAGGAAATA TGCACTGCTC ACTTTTTTGA AGGAAATGCC AAAGTTACGT TTTACAACAA 4620
GGCTAGAGTT TGTAAATTCT GGGTTCATTT GTGATGACAT AAGTCAGCAA ACTGCGGGAA 4680
TACTGTCTCT TCTATGTATT TTGTGAATAG TAAGCATAAT TTTAGTTTTG TATTATCAAT 4740
GAAAATTTCA CTTGAAATTA AAGCTGCCTT TTGTTATATT TTTAACCTAT AGGATAAGAT 4800
TCCAGTATTG TATATGAGTT TTAACAAATT AAAAAATCAA ATCATGTACA TTTGAAAATA 4860
TTTGCACACA TTTAAAAATA AATGTAAAGT TGTCTTTTAA ACTACTCGGA TGTGTCCTTT 4920
CTGAACAAAA



SEQ ID NO:232 PFD4 Protein sequence:

Protein Accession #: O43511

1 11 21 31 41 51

| | | | | |

MAAPGGRSEP PQLPEYSCSY MVSRPVYSEL AFQQQHERRL QERKTLRESL AKCCSCSRKR 60
AFGVLKTLVP ILEWLPKYRV KEWLLSDVIS GVSTGLVATL QGMAYALLAA VPVGYGLYSA 120
FFPILTYFIF GTSRHISVGP FPVVSLMVGS VVLSMAPDEH FLVSSSNGTV LNTTMIDTAA 180
RDTARVLIAS ALTLLVGIIQ LIFGGLQIGF IVRYLADPLV GGFTTAAAFQ VLVSQLKIVL 240
NVSTKNYNGV LSIIYTLVEI FQNIGDTNLA DFTAGLLTIV VCMAVKELND RFRHKIPVPI 300
PIEVIVTIIA TAISYGANLE KNYNAGIVKS IPRGFLPPEL PPVSLFSEML AASFSIAVVA 360
YAIAVSVGKV YATKYDYTID GNQEFIAFGI SNIFSGFFSC FVATTALSRT AVQESTGGKT 420
QVAGIISAAI VMIAILALGK LLEPLQKSVL AAVVIANLKG MFMQLCDIPR LWRQNKIDAV 480
IWVFTCIVSI ILGLDLGLLA GLIFGLLTVV LRVQFPSWNG LGSIPSTDIY KSTKNYKNIE 540
EPQGVKILRF SSPIFYGNVD GFKKCIKSTV GFDAIRVYNK RLKALRKIQK LIKSGQLRAT 600
KNGIISDAVS TNNAFEPDED IEDLEELDIP TKEIEIQVDW NSELPVKVNV PKVPIHSLVL 660
DCGAISFLDV VGVRSLRVIV KEFQRIDVNV YFASLQDYVI EKLEQCGFFD DNIRKDTFFL 720
TVHDAILYLQ NQVKSQEGQG SILETITLIQ DCKDTLELIE TELTEEELDV QDEAMRTLAS 780
780 



SEQ ID NO:233 PFH2 DNA SEQUENCE:

Nucleic Acid Accession #: NM_016029

Coding sequence: 228-1097 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51

| | | | | |

CTGCGATCCC GCAGGGCAGC GACGCGACTC TGGTGCGGGC CGTCTTCTTC CCCCCGAGCT 60
GGGCGTGCGC GGCCGCAATG AACTGGGAGC TGCTGCTGTG GCTGCTGGTG CTGTGCGCGC 120
TGCTCCTGCT CTTGGTGCAG CTGCTGCGCT TCCTGAGGGC TGACGGCGAC CTGACGCTAC 180
TATGGGCCGA GTGGCAGGGA CGACGCCCAG AATGGGAGCT GACTGATATG GTGGTGTGGG 240
TGACTGGAGC CTCGAGTGGA ATTGGTGAGG AGCTGGCTTA CCAGTTGTCT AAACTAGGAG 300
TTTCTCTTGT GCTGTCAGCC AGAAGAGTGC ATGAGCTGGA AAGGGTGAAA AGAAGATGCC 360
TAGAGAATGG CAATTTAAAA GAAAAAGATA TACTTGTTTT GCCCCTTGAC CTGACCGACA 420
CTGGTTCCCA TGAAGCGGCT ACCAAAGCTG TTCTCCAGGA GTTTGGTAGA ATCGACATTC 480
TGGTCAACAA TGGTGGAATG TCCCAGCGTT CTCTGTGCAT GGATACCAGC TTGGATGTCT 540
ACAGAAAGCT AATAGAGCTT AACTACTTAG GGACGGTGTC CTTGACAAAA TGTGTTCTGC 600
CTCACATGAT CGAGAGGAAG CAAGGAAAGA TTGTTACTGT GAATAGCATC CTGGGTATCA 660
TATCTGTACC TCTTTCCATT GGATACTGTG CTAGCAAGCA TGCTCTCCGG GGTTTTTTTA 720
ATGGCCTTCG AACAGAACTT GCCACATACC CAGGTATAAT AGTTTCTAAC ATTTGCCCAG 780
GACCTGTGCA ATCAAATATT GTGGAGAATT CCCTAGCTGG AGAAGTCACA AAGACTATAG 840
GCAATAATGG AGACCAGTCC CACAAGATGA CAACCAGTCG TTGTGTGCGG CTGATGTTAA 900
TCAGCATGGC CAATGATTTG AAAGAAGTTT GGATCTCAGA ACAACCTTTC TTGTTAGTAA 960
CATATTTGTG GCAATACATG CCAACCTGGG CCTGGTGGAT AACCAACAAG ATGGGGAAGA 1020
AAAGGATTGA GAACTTTAAG AGTGGTGTGG ATGCAGACTC TTCTTATTTT AAAATCTTTA 1080
AGACAAAACA TGACTGAAAA GAGCACCTGT ACTTTTCAAG CCACTGGAGG GAGAAATGGA 1140
AAACATGAAA ACAGCAATCT TCTTATGCTT CTGAATAATC AAAGACTAAT TTGTGATTTT 1200
ACTTTTTAAT AGATATGACT TTGCTTCCAA CATGGAATGA AATAAAAAAT AAATAATAAA 1260
AGATTGCCAT GAATCTTGCA AA



SEQ ID NO:234 PFH2 Protein sequence:

Protein Accession #: NP_057113

1 11 21 31 41 51

| | | | | |

MNWELLLWLL VLCALLLLLV QLLRFLRADG DLTLLWAEWQ GRRPEWELTD MVVWVTGASS 60
GIGEELAYQL SKLGVSLVLS ARRVHELERV KRRCLENGNL KEKDILVLPL DLTDTGSHEA 120
ATKAVLQEFG RIDILVNNGG MSQRSLCMDT SLDVYRKLIE LNYLGTVSLT KCVLPHMIER 180
KQGKIVTVNS ILGIISVPLS IGYCASKHAL RGFFKGLRTE LATYPGIIVS NICPGPVQSN 240
IVENSLAGEV TKTIGNNGDQ SHKMTTSRCV RLMLISMAND LKEVWISEQP FLLVTYLWQY 300
MPTWAWWITN KMGKKRIENF KSGVDADSSY FKIFKTKHD


Attorney Docket No.: 018501-004200US

SEQ ID NO:235 ACC5 DNA SEQUENCE

Nucleic Acid Accession #: NM_000450

Coding sequence: 1-1833 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51

| | | | | |

ATGATTGCTT CACAGTTTCT CTCAGCTCTC ACTTTGGTGC TTCTCATTAA AGAGAGTGGA 60
GCCTGGTCTT ACAACACCTC CACGGAAGCT ATGACTTATG ATGAGGCCAG TGCTTATTGT 120
CAGCAAAGGT ACACACACCT GGTTGCAATT CAAAACAAAG AAGAGATTGA GTACCTAAAC 180
TCCATATTGA GCTATTCACC AAGTTATTAC TGGATTGGAA TCAGAAAAGT CAACAATGTG 240
TGGGTCTGGG TAGGAACCCA GAAACCTCTG ACAGAAGAAG CCAAGAACTG GGCTCCAGGT 300
GAACCCAACA ATAGGCAAAA AGATGAGGAC TGCGTGGAGA TCTACATCAA GAGAGAAAAA 360
GATGTGGGCA TGTGGAATGA TGAGAGGTGC AGCAAGAAGA AGCTTGCCCT ATGCTACACA 420
GCTGCCTGTA CCAATACATC CTGCAGTGGC CACGGTGAAT GTGTAGAGAC CATCAATAAT 480
TACACTTGCA AGTGTGACCC TGGCTTCAGT GGACTCAAGT GTGAGCAAAT TGTGAACTGT 540
ACAGCCCTGG AATCCCCTGA GCATGGAAGC CTGGTTTGCA GTCACCCACT GGGAAACTTC 600
AGCTACAATT CTTCCTGCTC TATCAGCTGT GATAGGGGTT ACCTGCCAAG CAGCATGGAG 660
ACCATGCAGT GTATGTCCTC TGGAGAATGG AGTGCTCCTA TTCCAGCCTG CAATGTGGTT 720
GAGTGTGATG CTGTGACAAA TCCAGCCAAT GGGTTCGTGG AATGTTTCCA AAACCCTGGA 780
AGCTTCCCAT GGAACACAAC CTGTACATTT GACTGTGAAG AAGGATTTGA ACTAATGGGA 840
GCCCAGAGCC TTCAGTGTAC CTCATCTGGG AATTGGGACA ACGAGAAGCC AACGTGTAAA 900
GCTGTGACAT GCAGGGCCGT CCGCCAGCCT CAGAATGGCT CTGTGAGGTG CAGCCATTCC 960
CCTGCTGGAG AGTTCACCTT CAAATCATCC TGCAACTTCA CCTGTGAGGA AGGCTTCATG 1020
TTGCAGGGAC CAGCCCAGGT TGAATGCACC ACTCAAGGGC AGTGGACACA GCAAATCCCA 1080
GTTTGTGAAG CTTTCCAGTG CACAGCCTTG TCCAACCCCG AGCGAGGCTA CATGAATTGT 1140
CTTCCTAGTG CTTCTGGCAG TTTCCGTTAT GGGTCCAGCT GTGAGTTCTC CTGTGAGCAG 1200
GGTTTTGTGT TGAAGGGATC CAAAAGGCTC CAATGTGGCC CCACAGGGGA GTGGGACAAC 1260
GAGAAGCCCA CATGTGAAGC TGTGAGATGC GATGCTGTCC ACCAGCCCCC GAAGGGTTTG 1320
GTGAGGTGTG CTCATTCCCC TATTGGAGAA TTCACCTACA AGTCCTCTTG TGCCTTCAGC 1380
TGTGAGGAGG GATTTGAATT ATATGGATCA ACTCAACTTG AGTGCACATC TCAGGGACAA 1440
TGGACAGAAG AGGTTCCTTC CTGCCAAGTG GTAAAATGTT CPAGCCTGGC AGTTCCGGGA 1500
AAGATCAACA TGAGCTGCAG TGGGGAGCCC GTGTTTGGCA CTGTGTGCAA GTTCGCCTGT 1560
CCTGAAGGAT GGACGCTCAA TGGCTCTGCA GCTCGGACAT GTGGAGCCAC AGGACACTGG 1620
TCTGGCCTGC TACCTACCTG TGAAGCTCCC ACTGAGTCCA ACATTCCCTT GGTAGCTGGA 1680
CTTTCTGCTG CTGGACTCTC CCTCCTGACA TTAGCACCAT TTCTCCTCTG GCTTCGGAAA 1740
TGCTTACGGA AAGCAAAGAA ATTTGTTCCT GCCAGCAGCT GCCAAAGCCT TGAATCAGAC 1800
GGAAGCTACC AAAAGCCTTC TTACATCCTT TAA


SEQ ID NO:236 ACC5 Protein sequence:

Protein Accession #: NP_000441

1 11 21 31 41 51

| | | | | |

MIASQFLSAL TLVLLIKESG AWSYNTSTEA MTYDEASAYC QQRYTHLVAI QNKEEIEYLN 60
SILSYSPSYY WIGIRKVNNV WVWVGTQKPL TEEAKNWAPG EPNNRQKDED CVEIYIKREK 120
DVGMWNDERC SKKKLALCYT AACTNTSCSG HGECVETINN YTCKCDPGFS GLKCEQIVNC 180
TALESPEHGS LVCSHPLGNF SYNSSCSISC DRGYLPSSME TMQCMSSGEW SAPIPACNVV 240
ECDAVTNPAN GFVECFQNPG SFPWNTTCTF DCEEGFELMG AQSLQCTSSG NWDNEKPTCK 300
AVTCRAVRQP QNGSVRCSHS PAGEFTFKSS CNFTCEEGFM LQGPAQVECT TQGQWTQQIP 360
VCEAFQCTAL SNPERGYMNC LPSASGSFRY GSSCEFSCEQ GFVLKGSKRL QCGPTGEWDN 420
EKPTCEAVRC DAVHQPPKGL VRCAHSPIGE FTYKSSCAFS CEEGFELYGS TQLECTSQGQ 480
WTEEVPSCQV VKCSSLAVPG KINMSCSGEP VFGTVCKFAC PEGWTLNGSA ARTCGATGHW 540
SGLLPTCEAP TESNIPLVAG LSAAGLSLLT LAPFLLWLRK CLRKAKKFVP ASSCQSLESD 600
GSYQKPSYIL

SEQ ID NO:237 PM23 DNA SEQUENCE

Nucleic Acid Accession #: N51002

Coding sequence: 1-3793 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51

| | | | | |

ATGATGTGTG AAGTGATGCC CACGATTAAT GAGGACACCC CAATGAGCCA AAGGGGGTCC 60
CAAAGCAGTG GCTCGGACTC AGACTCCCAT TTTGAGCAGC TGATGGTGAA TATGCTAGAT 120
GAAAGGGATC GTCTTCTAGA CACCCTTCGG GAGACCCAGG AAAGCCTCTC ACTTGCCCAG 180
CAAAGACTTC AGGATGTCAT CTATGACCGA GACTCACTCC AGAGACAGCT CAATTCAGCC 240
CTGCCACAGG ATATCGAATC CCTAACAGGA GGGCTGGCTG GTTCTAAGGG GGCTGATCCA 300
CCGGAATTTG CTGCACTGAC AAAAGAATTA AATGCCTGCA GGGAACAACT TCTAGAAAAG 360
GAAGAAGAAA TCTCTGAACT TAAAGCTGAA AGAAACAACA CAAGACTATT ACTGGAGCAT 420
TTGGAGTGCC TTGTGTCACG ACATGAAAGA TCACTAAGAA TGACGGTGGT AAAACGGCAA 480
GCCCAGTCTC CCTCAGGAGT ATCCAGTGAA GTTGAAGTTC TCAAGGCACT GAAATCTTTG 540
TTTGAGCACC ACAAGGCCTT GGATGAAAAG GTAAGGGAGC GACTGAGGGT TTCTTTAGAA 600
AGAGTCTCTG CACTGGAAGA AGAACTAGCT GCTGCTAATC AGGAGATTGT TGCCTTGCGT 660
GAACAAAATG TTCATATACA AAGAAAAATG GCATCAAGCG AGGGATCCAC AGAGTCAGAA 720
CATCTTGAAG GGATGGAACC TGGACAGAAA GTCCATGAGA AGCGTTTGTC CAATGGTTCT 780
ATAGACTCAA CCGATGAAAC TAGTCAAATA GTTGAACTAC AAGAATTGCT TGAAAAGCAA 840
AACTATGAAA TGGCCCAGAT GAAAGAACGT TTAGCAGCCC TTTCTTCCCG AGTGGGAGAG 900
GTGGAACAGG AAGCAGAGAC AGCAAGAAAG GATCTCATTA AAACAGAAGA AATGAACACC 960
AAGTATCAAA GGGACATTAG GGAGGCCATG GCACAAAAGG AAGATATGGA AGAAAGAATT 1020
ACAACCCTTG AAAAGCGTTA CCTCAGTGCT CAGAGAGAAT CTACCTCCAT ACATGACATG 1080



AATGATAAAC TAGAAAATGA GTTAGCAAAT AAAGAAGCTA TCCTACGGCA GATGGAAGAG 1140 

AAAAACAGAC AGTTACAAGA ACGTCTTGAG CTAGCTGAAC AAAAGTTGCA GCAGACCATG 1200 

AGAAAGGCTG AAACCTTGCC TGAAGTAGAG GCTGAACTGG CTCAGAGAAT TGCAGCCCTA 1260 

ACCAAGGCTG AAGAGAGACA TGGAAATATT GAAGAACGTA TGAGACATTT AGAGGGTCAA 1320 

CTTGAAGAGA AGAATCAAGA ACTTCAAAGA GCTAGGCAAA GAGAGAAAAT GAATGAGGAG 1380 

CATAACAAGA GATTATCGGA TACGGTTGAT AGACTTCTGA CTGAATCCAA TGAACGCCTA 1440 

CAACTACACT TAAAGGAAAG AATGGCTGCT CTAGAAGAAA AGAATGTTTT AATTCAAGAA 1500 

TCAGAAACTT TCAGAAAGAA TCTTGAAGAA TCTTTACATG ATAAGGAAAG ATTAGCAGAA 1560 

GAAATTGAAA AGCTGAGATC TGAACTTGAC CAATTGAAAA TGAGAACTGG CTCTTTAATT 1620 

GAACCCACAA TACCAAGAAC TCATCTAGAC ACCTCAGCTG AGTTGCGGTA CTCAGTGGGA 1680 

TCCCTAGTGG ACAGCCAGTC TGATTACAGA ACAACTAAAG TAATAAGAAG ACCAAGGAGA 1740 

GGCCGCATGG GTGTGCGAAG AGATGAGCCA AAGGTGAAAT CTCTTGGGGA TCACGAGTGG 1800 

AATAGAACTC AACAGATTGG AGTACTAAGC AGCCACCCTT TTGAAAGTGA CACTGAAATG 1860

TCTGATATTG ATGATGATGA CAGAGAAACA ATTTTTAGCT CAATGGATCT TCTCTCTCCA 1920 

AGTGGTCATT CCGATGCCCA GACGCTAGCC ATGATGCTTC AGGAACAATT GGATGCCATC 1980 

AACAAAGAAA TCAGGCTAAT TCAGGAAGAA AAAGAATCTA CAGAGTTGCG TGCTGAAGAA 2040 

ATTGAAAATA GAGTGGCTAG TGTGAGCCTC GAAGGCCTGA ATTTGGCAAG GGTCCACCCA 2100 

GGTACCTCCA TTACTGCCTC TGTTACAGCT TCATCGCTGG CCAGTTCATC TCCCCCCAGT 2160 

GGACACTCAA CTCCAAAGCT CACCCCTCGA AGCCCTGCCA GGGAAATGGA TCGGATGGGA 2220 

GTCATGACAC TGCCAAGTGA TCTGAGGAAA CATCGGAGAA AGATTGCAGT TGTGGAAGAA 2280 

GATGGTCGAG AGGACAAAGC AACAATTAAA TGTGAAACTT CTCCTCCTCC TACCCCTAGA 2340 

GCCCTCAGAA TGACTCACAC TCTCCCTTCT TCCTACCACA ATGATGCTCG AAGTAGTTTA 2400 

TCTGTCTCTC TTGAGCCAGA AAGCCTCGGG CTTGGTAGTG CCAACAGCAG CCAAGACTCT 2460 

CTTCACAAAG CCCCCAAGAA GAAAGGAATC AAGTCTTCAA TAGGACGTTT GTTTGGTAAA 2520 

AAAGAAAAAG CTCGACTTGG GCAGCTCCGA GGCTTTATGG AGACTGAAGC TGCAGCTCAG 2580 

GAGTCCCTGG GGTTAGGCAA ACTCGGAACT CAAGCTGAGA AGGATCGAAG ACTAAAGAAA 2640 

AAGCATGAAC TTCTTGAAGA AGCTCGGAGA AAGGGATTAC CTTTTGCCCA GTGGGATGGG 2700 

CCAACTGTGG TCGCATGGCT AGAGCTTTGG TTGGGAATGC CTGCGTGGTA CGTGGCAGCC 2760 

TGCCGAGCCA ACGTGAAGAG TGGTGCCATC ATGTCTGCTT TATCTGACAC TGAGATCCAG 2820 

AGAGAAATTG GAATCAGCAA TCCACTGCAT CGCTTAAAAC TTCGATTAGC AATCCAGGAG 2880 

ATGGTTTCCC TAACAAGTCC TTCAGCTCCT CCAACATCTC GAACTCCTTC AGGCAACGTT 2940 

TGGGTGACTC ATGAAGAAAT GGAAAATCTT GCAGCTCCAG CAAAAACGAA AGAATCTGAG 3000

GAAGGAAGCT GGGCCCAGTG TCCGGTTTTT CTACAGACCC TGGCTTATGG AGATATGAAT 3060 

CATGAGTGGA TTGGAAATGA ATGGCTTCCC AGCTTGGGGT TACCTCAGTA CAGAAGTTAC 3120 

TTTATGGAAT GCTTGGTAGA TGCAAGAATG TTAGATCACC TAACAAAAAA AGATCTCCGT 3180 

GTCCATTTAA AAATGGTGGA TAGTTTCCAT CGAACAAGTT TACAATATGG AATTATGTGC 3240 

TTAAAGAGGT TGAATTATGA CAGAAAAGAA CTAGAAAGAA GACGGGAAGC AAGCCAACAT 3300 

GAAATAAAAG ACGTGTTGGT GTGGAGCAAT GACCGAATTA TTCGCTGGAT ACAAGCAATT 3360 

GGACTTCGAG AATATGCAAA TAATATACTT GAGAGCGGTG TGCATGGCTC ACTTATAGCC 3420 

CTGGATGAAA ACTTTGACTA CAGCAGCTTA ACTTTATTAT TACAGATTCC AACACAGAAC 3480 

ACCCAGGCAA GGCAGATTCT TGAAAGAGAA TACAATAACC TCTTGGCCCT GGGAACTGAA 3540 

AGGCGACTGG ATGAAAGTGA TGACAAGAAC TTCAGACGTG GATCAACCTG GAGAAGGCAG 3600 

TTTCCTCCTC GTGAAGTACA TGGAATCAGC ATGATGCCTG GGTCCTCAGA AACATTACCA 3660 

GCTGGATTTA GGTTAACCAC AACCTCTGGG CAATCAAGAA AAATGACAAC AGATGTTGCT 3720 

TCATCAAGAC TGCAGAGGTT AGACAACTCC ACTGTTCGCA CATACTCATG TCTCGAGTAA 3780 
GCGGCCGCTT TAA 



SEQ ID NO:238 PM28 Protein sequence:
Protein Accession #: none found

1 11 21 31 41 51

| | | | | |

MMCEVMPTIN EDTPMSQRGS QSSGSDSDSH FEQLMVNMLD ERDRLLDTLR ETQESLSLAQ 60

QRLQDVIYDR DSLQRQLNSA LPQDIESLTG GLAGSKGADP PEFAALTKEL NACREQLLEK 120

EEEISELKAE RNNTRLLLEH LECLVSRHER SLRMTVVKRQ AQSPSGVSSE VEVLKALKSL 180

FEHHKALDEK VRERLRVSLE RVSALEEELA AANQEIVALR EQNVHIQRKM ASSEGSTESE 240 

HLEGMEPGQK VHEKRLSNGS IDSTDETSQI VELQELLEKQ NYEMAQMKER LAALSSRVGE 300 

VEQEAETARK DLIKTEEMNT KYQRDIREAM AQKEDMEERI TTLEKRYLSA QRESTSIHDM 360 

NDKLENELAN KEAILRQMEE KNRQLQERLE LAEQKLQQTM RKAETLPEVE AELAQRIAAL 420 

TKAEERHGNI EERMRHLEGQ LEEKNQELQR ARQREKMNEE HNKRLSDTVD RLLTESNERL 480 

QLHLKERMAA LEEKNVLIQE SETFRKNLEE SLHDKERLAE EIEKLRSELD QLKMRTGSLI 540 

EPTIPRTHLD TSAELRYSVG SLVDSQSDYR TTKVIRRPRR GRMGVRRDEP KVKSLGDHEW 600 

NRTQQIGVLS SHPFESDTEM SDIDDDDRET IFSSMDLLSP SGHSDAQTLA MMLQEQLDAI 660 

NKEIRLIQEE KESTELRAEE IENRVASVSL EGLNLARVHP GTSITASVTA SSLASSSPPS 720

GHSTPKLTPR SPAREMDRMG VMTLPSDLRK HRRKIAVVEE DGREDKATIK CETSPPPTPR 780

ALRMTHTLPS SYHNDARSSL SVSLEPESLG LGSANSSQDS LHKAPKKKGI KSSIGRLFGK 840

KEKARLGQLR GFMETEAAAQ ESLGLGKLGT QAEKDRRLKK KHELLEEARR KGLPFAQWDG 900

PTVVAWLELW LGMPAWYVAA CRANVKSGAI MSALSDTEIQ REIGISNPLH RLKLRLAIQE 960

MVSLTSPSAP PTSRTPSGNV WVTHEEMEML AAPAKTKESE EGSWAQCPVF LQTLAYGDMN 1020 

HEWIGNEWLP SLGLPQYRSY FMECLVDARM LDHLTKKDLR VHLKMVDSFH RTSLQYGIMC 1080 

LKRLNYDRKE LERRREASQH EIKDVLVWSK DRIIRWIQAI GLREYANNIL ESGVHGSLIA 1140 

LDENFDYSSL TLLLQIPTQN TQARQILERE YNNLLALGTE RRLDESDDKN FRRGSTWRRQ 1200 
FPPREVHGIS MMPGSSETLP AGFRLTTTSG QSRKMTTDVA SSRLQRLDNS TVRTYSCLE 

SEQ ID NO:239 PCI4 DNA SEQUENCE

Nucleic Acid Accession #: NM_016570

Coding sequence: 1-1134 (underlined sequences correspond to start and stop codons)

1 11 21 31 41 51

| | | | | |

ATGAGGCGAC TGAATCGGAA AAAAACTTTA AGTTTGGTAA AAGAGTTGGA TGCCTTTCCG 60

AAGGTTCCTG AGAGCTATGT AGAGACTTCA GCCAGTGGAG GTACAGTTTC TCTAATAGCA 120

TTTACAACTA TGGCTTTATT AACCATAATG GAATTCTCAG TATATCAAGA TACATGGATG 180 

AAGTATGAAT ACGAAGTAGA CAAGGATTTT TCTAGCAAAT TAAGAATTAA TATAGATATT 240 

ACTGTTGCCA TGAAGTGTCA ATATGTTGGA GCGGATGTAT TGGATTTAGC AGAAACAATG 300 

GTTGCATCTG CAGATGGTTT AGTTTATGAA CCAACAGTAT TTGATCTTTC ACCACAGCAG 360 

AAAGAGTGGC AGAGGATGCT GCAGCTGATT CAGAGTAGGC TACAAGAAGA GCATTCACTT 420

CAAGATGTGA TATTTAAAAG TGCTTTTAAA AGTACATCAA CAGCTCTTCC ACCAAGAGAA 480

GATGATTCAT CACAGTCTCC AAATGCATGC AGAATTCATG GCCATCTATA TGTCAATAAA 540

GTAGCAGGGA ATTTTCACAT AACAGTGGGC AAGGCAATTC CACATCCTCG TGGTCATGCA 600 

CATTTGGCAG CACTTGTCAA CCATGAATCT TACAATTTTT CTCATAGAAT AGATCATTTG 660 

TCTTTTGGAG AGCTTGTTCC AGCAATTATT AATCCTTTAG ATGGAACTGA AAAAATTGCT 720

ATAGATCACA ACCAGATGTT CCAATATTTT ATTACAGTTG TGCCAACAAA ACTACATACA 780 

TATAAAATAT CAGCAGACAC CCATCAGTTT TCTGTGACAG AAAGGGAACG TATCATTAAC 840 

CATGCTGCAG GCAGCCATGG AGTCTCTGGG ATATTTATGA AATATGATCT CAGTTCTCTT 900 

ATGGTGACAG TTACTGAGGA GCACATGCCA TTCTGGCAGT TTTTTGTAAG ACTCTGTGGT 960 

ATTGTTGGAG GAATCTTTTC AACAACAGGC ATGTTACATG GAATTGGAAA ATTTATAGTT 1020

GAAATAATTT GCTGTCGTTT CAGACTTGGA TCCTATAAAC CTGTCAATTC TGTTCCTTTT 1080 

GAGGATGGCC ACACAGACAA CCACTTACCT CTTTTAGAAA ATAATACACA TTGA 

SEQ ID NO:240 PCI4 Protein sequence:

Protein Accession #: NP_057654

1 11 21 31 41 51

| | | | | |

MRRLNRKKTL SLVKELDAFP KVPESYVETS ASGGTVSLIA FTTMALLTIM EFSVYQDTWM 60

KYEYEVDKDF SSKLRINIDI TVAMKCQYVG ADVLDLAETM VASADGLVYE PTVFDLSPQQ 120

KEWQRMLQLI QSRLQEEHSL QDVIFKSAFK STSTALPPRE DDSSQSPNAC RIHGHLYVNK 180

VAGNFHITVG KAIPHPRGHA HLAALVNHES YNFSHRIDHL SFGELVPAII NPLDGTEKIA 240

IDHNQMFQYF ITVVPTKLHT YKISADTHQF SVTERERIIN HAAGSHGVSG IFMKYDLSSL 300
MVTVTEEHMP FWQFFVRLCG IVGGIFSTTG MLHGIGKFIV EIICCRFRLG SYKPVNSVPF 360

EDGHTDNHLP LDENNTH
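
The "Coding sequence" coordinates given with each DNA listing can be checked against the paired protein listing by slicing the coding region and translating it. The following is a minimal sketch, assuming the coordinates are 1-based and inclusive and that the standard genetic code applies; the helper names `clean_listing`, `extract_cds`, and `translate` are hypothetical and are not part of the listing. It is applied here only to the first row of the PCI4 DNA sequence above.

```python
import re

# Standard genetic code (NCBI translation table 1), bases ordered T, C, A, G.
_BASES = "TCAG"
_AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: _AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(_BASES)
    for j, b in enumerate(_BASES)
    for k, c in enumerate(_BASES)
}


def clean_listing(rows):
    """Join listing rows, dropping spaces, position numbers and any other
    character that is not A, C, G, T or N.

    Caution: a base that was garbled into another character is dropped too,
    which shifts every downstream position by one.
    """
    return "".join(re.sub(r"[^ACGTN]", "", row.upper()) for row in rows)


def extract_cds(seq, start, end):
    """Return the coding region for 1-based, inclusive coordinates (assumed)."""
    return seq[start - 1:end]


def translate(cds):
    """Translate codon by codon, stopping at the first stop codon.

    Codons containing N (or otherwise missing from the table) become 'X'.
    """
    protein = []
    for i in range(0, len(cds) - 2, 3):
        aa = CODON_TABLE.get(cds[i:i + 3], "X")
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


# First row of the PCI4 DNA listing (positions 1-60), copied with its position number.
first_row = ["ATGAGGCGAC TGAATCGGAA AAAAACTTTA AGTTTGGTAA AAGAGTTGGA TGCCTTTCCG 60"]
fragment = clean_listing(first_row)
print(translate(extract_cds(fragment, 1, 60)))  # prints MRRLNRKKTLSLVKELDAFP
```

The printed residues agree with the first two groups of the PCI4 protein sequence (MRRLNRKKTL SLVKELDAFP). Running the same comparison over a full listing is one way to locate rows where the DNA and protein listings disagree.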

SEQ ID NO:241 P6A7 DNA SEQUENCE 

Nucleic Acid Accession #: AA219134

Coding sequence: 24-1 St 5 (underlined sequences correspond to start and stop codons)

AATTCGCCCT TGCTTAATTA AGCATGTTTA CCTTCCTGTC ATCTGTCACT GCTGCTGTCA 60

GTGGCCTCCT GGTGGGTTAT GAACTTGGGA TCATCTCTGG GGCJCT1CTT CAGATCAAAA 120
CCTTATTAGC CCTGAGCTGC CATGAGCACKj AAATGGTTGT GAGCTCCCTC GTCATTGGAG 180

CCCTCCTTGC CTCACTCACC GGAGGGGTCC TGATAGACAG ATATGGAAGA AGGACAGCAA 240 

TCATCTTGTC ATCCTGCCTG CTTGGACTCG GAAGCTTAGT CTTGATCCTC AGTTTATCCT 300 

ACACGGTTCT TATAGTGGGA CGCATTGCCA TAGGGGTTTC CATCTCCCTC TCTTCCATTG 360 

CC ACTTGTGT TTAC ATCGC A G AG ATTGCTC CTC AAC AC AG A AG AGGCCTT CTTGTGTC AC 420 
50 TGAATGAGCT GATGATTGTC ATCGGCATTC TTTCTGCCTA TATTTCAAAT TACGCATTTG 480 

CCAATGTTTT CCATGGCTGG AAGTACATGT TTGGTCTTGT GATTCCCTTG GGAGTTTTGC 540 

AAGCAATTGC AATGTATTTT CTTCCTCCAA GCCCTCGGTT TCTGGTG ATG AAAGG ACAAG 600 

AGGGAGCTGC TAGCAAGGTT CTTGGAAGGT TAAG AGCACT CTCAGATACA ACTGAGGAAC 660 

TCACTGTGAT CAAATCCTCC CTGAAAGATG AATATCAGTA CAGTTTTTGG GATCTGTTTC 720 
5 5 GTTCAAAAG A C AACATGCGG ACCCGAATAA TG ATAGG ACT AAC ACTAGTA TTTTTTGTAC 780 

AAATCACTGG CCAACCAAAC ATATTGTTCT ATGCATCAAC TGTTTTGAAG TCAGTTGGAT 840 

TTCAAAGCAA TGAGGCAGCT AGCCTCGCCT CCACTGGGGT TGG AGTCGTC AAGGTCATTA 900 

GCACCATCCC TGCCACTCTT CTTGTAGACC ATGTCGGCAG CAAAACATTC CTCTGCATTG 960 

GCTCCTCTGT GATGGCAGCT TCGTTGGTGA CCATGGGCAT CGTAAATCTC AACATCCACA 1020 
60 TGAACTTCAC CCATATCTGC AGAAGCCACA ATTCTATCAA CCAGTCCTTG GATGAGTCTG 1080 

TGATTTATGG ACCAGGAAAC CTGTCAACCA ACAACAATAC TCTCAGAGAC CACTTCAAAG 1 140 

GGATTTCTTC CCATAGCAGA AGCTCACTCA TGCCCCTGAG AAATGATGTG GATAAGAGAG 1200 

GGGAGACGAC CTCAGCATCC TTGCTAAATG CTGGATTAAG CCACACTGAA TACCAGATAG 1260 

TCACAGACCC TGGGGACGTC CCAGCTTTTT TGAAATGGCT GTCCTTAGCC AGCTTGCTTG 1320 
65 TTTATGTTGC TGCTTTTTCA ATTGGTCTAG GACCAATGCC CTGGCTGGTG CTCAGCGAGA 1380 

TCTTTCCTGG TGGGATCAGA GGACGAGCCA TGGCTTTAAC TTCTAGCATG AACTGGGGCA 1440 

TCAATCTCCT CATCTCGCTG ACATTTTTGA CTGTAACTGA TCTTATTGGC CTGCCATGGG 1500 

TGTGCTTTAT ATATACAATC ATGAGTCTAG ATCTTATTGG CCTGCCATGG GTGTGCTTTA 1560 

TATATACAAT CATGAGTCTA GCATCCCTGC TTTTTGTTGT TATGTTTATA CCTGAGACAA 1620 
70 AGGGATGCTC TTTGGAACAA ATATCAATGG AGCTAGCAAA AGTGAACTAT GTGAAAAACA 1680 

ACATTTGTTT TATGAGTCAT CACCAAGAAG AATTAGTGCC AAAACAGCCT CAAAAAAGAA 1740 

AACCCCAGGA GCAGCTCTTG GAGTGTAACA AGCTGTGTGG TAGGGGCCAA TCCAGGCAGC 1800 

TTTCTCCAGA GACCTAATGG CCTCAACACC TTCTGAACGT GGATAGTGCC AGAACACTTA 1860 

GGAGGGTGTC TTTGG ACCAA TGCATAGTTG CGACTCCTGT GCTCTCTTTT CAGTGTCATG 1 920 
7 5 GAACTGGTTT TG AAG AG ACA CTCTG AAATG ATAA AG ACAG CCTTTAATCC CCCTCCTCMC 1980 

CAGAAGGAAC CTCAAAAGCT AGATGAGGTA CAAGGTCCTA AGTGATCTCT TTTTCTGAGC 2040 

AGGATATCAG GTTAAAAAAA AAAAGTTACT GGCTGGTTTA ATACTTTCTA CCTTCTTCAC 2100 

AGAGCAGCCT TTGAATAG AC TATGTCCTAG TGAAGACATC AACCTCCGCC TTAAGCTATG 2160 

TATGTATGGA GGCCAGTCGC AGCTTTATTA TGCAGACACA CAAGTGGTCT GGACATGAGG 2220 
80 GTACAGTTTC TGCCT ACCAA GACACTACTT GCACTGGATC TTACGCAAAA AAGAACCAGA 2280 

ACACACAGTG TGGACAACTG CCCATATATT CTATCTAGAT TAGGAGAGGG TCCTGGCTAG 2340
GATTTTAGTG GTAATTCCTA GTTACATTCA ACAAGTATAA AGATTATAGA GCTTATTTTA 2400
TGAACTATAA ACTATAATTT AATGCAAAAT ATCCTTTTAT GAATTTCATG TTAATATTGT 2460
GAAATATTAA AATAATTCCR CAATAGTTGA GAAAAATGAG CATTTTTTTC CATTTTTAAA 2520
AAATGCATAG AAAAGACAAT TTTAAAATCC TGGGACCATA TTTATTTAGA AGTAGCTGTT 2580
AGTAAAACAT TAGAAAAGGA GTCAGGCCAT TAGGTTATTT ATCCAAATCT CTAAGCAATT 2640
AGGTTGAAGT TATTAAGTCA AGCCTAGAAA AGCTGCCTCC TTGTAAGGCT TTCATGACAA 2700
TGTATAGTAA TCCACAGTGT CCAATTCTTC ACACTCCTCA GGAATATCAC TACCTCAGGT 2760
TACGGTACAC AGGCTATAAT TGATGATGAT GTTCAGATAA CTGAAGACAC AATAAATGAC 2820
ATTCAGACAT CAGGAMAAWW CCCTCATGTT CTTTTCTATG ATGGCCACCT GTACCAGCAA 2880
CGTGGGTTTC ACCCACACAA CGATGAACTG TTCTCTTACT TCTCCAGTTG ATTTTAAAGA 2940
CTTGTTAAGA GGTCTTACTA ATAAAATTTG GGTATGATAG AAAAWCCACA ATCAAAWCTT 3000
GAACCAAATA ACATATTAAA TTACTAATAT TTAAGTGATG GAAGACACAC AAAAAACTTA 3060
AAAGCACGAA CAACCTAACT TGAAAAAGAA TTTTAAAATA TGATTAACCT GAAGAAAAGA 3120
GAATCCTAAG AGCCAAAGCT CCTTTTTATT TAGCTTGGAA TTTTCCTATT GGTTCCTAAC 3180
AAACTGTCCC AATGTCATAT AAGGAAACAT GATCTATTAC ATTCCTTTAT AACAATGTGG 3240
AGAGACTATA AACCTATGTA AGTAGTAAAA CTATATYAGA GACTCAGGAG ACTGACTAAA 3300
AGGCCTGGAT CTGCAGTGTA TTATCTGTAT AAAAATTGGC AGGGGGAAGC TAAAAGGAAA 3360
GGAGATTGGA GATCTCAATT CTATCATGGT GTATTTCATA CGCAAATCAG AGCATGCATT 3420
GTTTTTTGTT TTTGGAAAGA GAAGGGAAGT GTGTTCTGCC CCATGTTTCC TTCCGTGTTT 3480
ATAGTTCAAA CTCTATATAT ACTTCAGGTA TTTTTTGTTT AGCCCTTCAT TATAAATGGG 3540
CAGGAAATTG TTTATCAACC TAGCCAGTTT ATTACTAGTG ACCTTGACTT CAGTATCTTG 3600
AGCATTCTTT TATATTTTTC TTTTATTATC CTGAGTCTGT AACTAAACAA TTTTGTCTTC 3660
AAATTTTTAT CCAATATCCA TTGCACCACA CCAAATCAAG CTTCTTGATT TTCAAAAATA 3720
AAAAGGGGGA AATACTTACA ACTTGTACAT ATATATTCAC AGTTTTTATT TATAAAAAAA 3780
ATTTACAGTA CTTATGGAGA GCCAGCAGAA GACATCAGAG CACTCACTTC TTCCCATCTT 3840
TGTTAAGGTT AGCGAATTAC CCATGGACAC TGTTAGGTGA GGCTCATTCG GCAGCCCTGA 3900
AAACAAACCT GGTCACACTG TCTTTACCCT CTCCCTTCAG ATAAAGCACT TCGATTATCT 3960
ATTGATCTGC CCAGTTTTCA AGTCATGCGA ATACTAAAAA GGTTACATCA TCTGGATCTG 4020
TACCTTGGCT ATATAAGCAT GTTTTCCCCC TATTCTATGT TTCTTTTTTT GGTGAACATT 4080
GAAAAACAGG AGGTGACTTA TTACTGTTAA TTAAAACTAA ATGAAAAATG TCAAGTCTTT 4140
AAAACAGTGA GCTTGTAACT CTTTCATGTA ATTTTATTCT CTATGAATTT GGCTATCCTA 4200
CTGAATCTTA AAATAAAGGA AATAAACACT TTTTTTTWAA AAAAAAGGAA AAATAMAARW 4260
MWAAAAATCT CAATGAAATA TTTCACAAGA AGGAAAAA

SEQ ID NO:242 PBA7 Protein sequence:

Protein Accession #: AAF91431

MFTFLSSVTA AVSGLLVGYE LGIISGALLQ IKTLLALSCH EQEMVVSSLV IGALLASLTG 60
GVLIDRYGRR TAIILSSCLL GLGSLVLILS LSYTVLIVGR IAIGVSISLS SIATCVYIAE 120
IAPQHRRGLL VSLNELMIVI GILSAYISNY AFANVFHGWK YMFGLVIPLG VLQAIAMYFL 180
PPSPRFLVMK GQEGAASKVL GRLRALSDTT EELTVIKSSL KDEYQYSFWD LFRSKDNMRT 240
RIMIGLTLVF FVQITGQPNI LFYASTVLKS VGFQSNEAAS LASTGVGVVK VISTIPATLL 300
VDHVGSKTFL CIGSSVMAAS LVTMGIVNLN IHMNFTHICR SHNSINQSLD ESVIYGPGNL 360
STNNNTLRDH FKGISSHSRS SLMPLRNDVD KRGETTSASL LNAGLSHTEY QIVTDPGDVP 420
AFLKWLSLAS LLVYVAAFSI GLGPMPWLVL SEIFPGGIRG RAMALTSSMN WGINLLISLT 480
FLTVTDLIGL PWVCFIYTIM SLDLIGLPWV CFIYTIMSLA SLLFVVMFIP ETKGCSLEQI 540
SMELAKVNYV KNNICFMSHH QEELVPKQPQ KRKPQEQLLE CNKLCGRGQS RQLSPET



SEQ ID NO:243 PAB4 DNA sequence:
Nucleic Acid Accession #: AA172056

Coding sequence: 121-339 (underlined sequences correspond to start and stop codons)

TTTAGCCACC AGAGGANTTC TCTTGAAATA CCCAAAATCC ATCAGTATCT TGAATCATGC 60
TGGATTTTGA AGAATTCTTA AGAAGCCATG TAAAGGGGGC TCTCTGGCCT TGAAATAGTG 120
ATGTTTTTTA TACAGAAAGG AGAATGCAGA ATGGTCAGAC TATCATGCAC TGTTAAATTT 180
GATTTCAAGA AATTACAGGA AAACTTTCCA AAGTTCCATC TCACAGAANN TTATTTTNCC 240
AAGAATTCCA AGATAAGTTT AGTTTTATGG AAGACTTTTA TGTGGTTTTT ACTCACTCTT 300
CATCTCAGAC ATCGACAGAT GATTACATCA CTTATAGTTC TAGTAAATTT ATTAATATAA 360
AACTCAGAGA CATTCCAATA TCCACATTGC TTACACCATT AGGCATAGAT TCAGTGTCAG 420
CTATGACAAT TGAAAATGAG CTGTTTTGTG ATTTAAAGGT TTAAATTTCT CTAACCAAAC 480
TGCTTGATCC AGATGCAGGA CTGCAAATGT TAATATTTGT TCTGGAAGAA CAATCAAATA 540
AGACTTAAGA GGAAAGGGAA TGGCCACAAT CCACCTGAAA TTTTTTCTTA AAAAGTGTGC 600
AGCCTACTAA ATCAGAATGA AAATAGAAGT ACAAGATTAT AAACAAAATG CAATCAAACT 660
TTTCTTAAGC TTACCTAAAG TTATTTCATC TGAAAATTTC AAGCAACTTT GTTCAACATT 720
AAATTGACAA TCTAAACTAA CAAGTCTTTT GAATTTATGC ATGGTAGTAA ACATTCTCTC 780
TATTAACTTT ATTACCTAAG GCTAAACCTA AAATTTTTAA GCAAAATTAG AAAAATAGTC 840
TTCACTCATC AAAAAATAAA GTTTGTTACA TTTAGTATTT TCCCAATAAA ATTGGTCGTT 900
CTTGGTTTTT TATTTGGAGA GTCTGTGCAA AATGTCACTA AAAATAAATT AGCACTAGAA 960
ATTATTTCTA AATACCAAA

SEQ ID NO:244 PBQ8 DNA sequence

Nucleic Acid Accession #: X51405

Coding sequence: 3-1721 (underlined sequence corresponds to start and stop codon)

1 11 21 31 41 51

AAATGGCGTG CCCGTCTCTC CGCCGGCCCC CTGCCTCGCA GTGGTTTCTC CTGCAGCTCC 60
CCTGGGCTCC GCGGCCAGTA GTGCAGCCCG TGGAGCCGCG GCTTTGCCCG TCTCCTCTGG 120
GTGGCCCCAG TGCGCGGGCT GACACTCATT CAGCCGGGGA AGGTGAGGCG AGTAGAGGCT 180
GGTGCGGAAC TTGCCGCCCC CAGCAGCGCC GGCGGGCTAA GCCCAGGGCC GGGCAGACAA 240
AAGAGGCCGC CCGCGTAGGA AGGCACGGCC GGCGGCGGCG GAGCGCAGCG ATGGCCGGGC 300
GAGGGGGCAG CGCGCTGCTG GCTCTGTGCG GGGCACTGGC TGCCTGCGGG TGGCTCCTGG 360
GCGCCGAAGC CCAGGAGCCC GGGGCGCCCG CGGCGGGCAT GAGGCGGCGC CGGCGGCTGC 420
AGCAAGAGGA CGGCATCTCC TTCGAGTACC ACCGCTACCC CGAGCTGCGC GAGGCGCTCG 480
TGTCCGTGTG GCTGCAGTGC ACCGCCATCA GCAGGATTTA CACGGTGGGG CGCAGCTTCG 540
AGGGCCGGGA GCTCCTGGTC ATCGAGCTGT CCGACAACCC TGGCGTCCAT GAGCCTGGTG 600
AGCCTGAATT TAAATACATT GGGAATATGC ATGGGAATGA GGCTGTTGGA CGAGAACTGC 660
TCATTTTCTT GGCCCAGTAC CTATGCAACG AATACCAGAA GGGGAACGAG ACAATTGTCA 720
ACCTGATCCA CAGTACCCGC ATTCACATCA TGCCTTCCCT GAACCCAGAT GGCTTTGAGA 780
AGGCAGCGTC TCAGCCTGGT GAACTCAAGG ACTGGTTTGT GGGTCGAAGC AATGCCCAGG 840
GAATAGATCT GAACCGGAAC TTTCCAGACC TGGATAGGAT AGTGTACGTG AATGAGAAAG 900
AAGGTGGTCC AAATAATCAT CTGTTGAAAA ATATGAAGAA AATTGTGGAT CAAAACACAA 960
AGCTTGCTCC TGAGACCAAG GCTGTCATTC ATTGGATTAT GGATATTCCT TTTGTGCTTT 1020
CTGCCAATCT CCATGGAGGA GACCTTGTGG CCAATTATCC ATATGATGAG ACGCGGAGTG 1080
GTAGTGCTCA CGAATACAGC TCCTCCCCAG ATGACGCCAT TTTCCAAAGC TTGGCCCGGG 1140
CATACTCTTC TTTCAACCCG GCCATGTCTG ACCCCAATCG GCCACCATGT CGCAAGAATG 1200
ATGATGACAG CAGCTTTGTA GATGGAACCA CCAACGGTGG TGCTTGGTAC AGCGTACCTG 1260
GAGGGATGCA AGACTTCAAT TACCTTAGCA GCAACTGTTT TGAGATCACC GTGGAGCTTA 1320
GCTGTGAGAA GTTCCCACCT GAAGAGACTC TGAAGACCTA CTGGGAGGAT AACAAAAACT 1380
CCCTCATTAG CTACCTTGAG CAGATACACC GAGGAGTTAA AGGATTTGTC CGAGACCTTC 1440
AAGGTAACCC AATTGCGAAT GCCACCATCT CCGTGGAAGG AATAGACCAC GATGTTACAT 1500
CCGCAAAGGA TGGTGATTAC TGGAGATTGC TTATACCTGG AAACTATAAA CTTACAGCCT 1560
CAGCTCCAGG CTATCTGGCA ATAACAAAGA AAGTGGCAGT TCCTTACAGC CCTGCTGCTG 1620
GGGTTGATTT TGAACTGGAG TCATTTTCTG AAAGGAAAGA AGAGGAGAAG GAAGAATTGA 1680
TGGAATGGTG GAAAATGATG TCAGAAACTT TAAATTTTTA AAAAGGCTTC TAGTTAGCTG 1740
CTTTAAATCT ATCTATATAA TGTAGTATGA TGTAATGTGG TCTTTTTTTT AGATTTTGTG 1800
CAGTTAATAC TTAACATTGA TTTATTTTTT AATCATTTAA ATATTAATCA ACTTTCCTTA 1860
AAATAAATAG CCTCTTAGGT AAAAATATAA GAACTTGATA TATTTCATTC TCTTATATAG 1920
TATTCATTTT CCTACCTATA TTACACAAAA AAGTATAGAA AAGATTTAAG TAATTTTGCC 1980
ATCCTAGGCT TAAATGCAAT ATTCCTGGTA TTATTTACAA TGCAGAATTT TTTGAGTAAT 2040
TCTAGCTTTC AAAAATTAGT GAAGTTCTTT TACTGTAATT GGTGACAATG TCACATAATG 2100
AATGCTATTG AAAAGGTTAA CAGATACAGC TCGGAGTTGT GAGCACTCTA CTGCAAGACT 2160
TAAATAGTTC AGTATAAATT GTCGTTTTTT TCTTGTGCTG ACTAACTATA AGCATGATCT 2220
TGTTAATGCA TTTTTGATGG GAAGAAAAGG TACATGTTTA CAAAGAGGTT TTATGAAAAG 2280
AATAAAAATT GACTTCTTGC TTGTACATAT AGGAGCAATA CTATTATATT ATGTAGTCCG 2340
TTAACACTAC TTAAAAGTTT AGGGTTTTCT CTTGGTTGTA GAGTGGCCCA GAATTGCATT 2400
CTGAATGAAT AAAGGTTAAA AAAAAATCCC CAGTGAAAAA AAA

SEQ ID NO:245 PBQ8 Protein sequence
Protein Accession #: P16870

MAGRGGSALL ALCGALAACG WLLGAEAQEP GAPAAGMRRR RRLQQEDGIS FEYHRYPELR 60
EALVSVWLQC TAISRIYTVG RSFEGRELLV IELSDNPGVH EPGEPEFKYI GNMHGNEAVG 120
RELLIFLAQY LCNEYQKGNE TIVNLIHSTR IHIMPSLNPD GFEKAASQPG ELKDWFVGRS 180
NAQGIDLNRN FPDLDRIVYV NEKEGGPNNH LLKNMKKIVD QNTKLAPETK AVIHWIMDIP 240
FVLSANLHGG DLVANYPYDE TRSGSAHEYS SSPDDAIFQS LARAYSSFNP AMSDPNRPPC 300
RKNDDDSSFV DGTTNGGAWY SVPGGMQDFN YLSSNCFEIT VELSCEKFPP EETLKTYWED 360
NKNSLISYLE QIHRGVKGFV RDLQGNPIAN ATISVEGIDH DVTSAKDGDY WRLLIPGNYK 420
LTASAPGYLA ITKKVAVPYS PAAGVDFELE SFSERKEEEK EELMEWWKMM SETLNF



SEQ ID NO:246 PBY4 DNA sequence
Nucleic Acid Accession #: AF038966

Coding sequence: 91-1107 (underlined sequence corresponds to start and stop codon)

1 11 21 31 41 51
| | | | | |
GGGGCGACGT GAGCGCGCAG GGGGGCGGCG GCCTCGCCTC GTCTCTCTCT CTGCGCCTGG 60
GTCGGGTGGG TGACGCCGAG AGCCAGAGAG ATGTCGGATT TCGACAGTAA CCCGTTTGCC 120
GACCCGGATC TCAACAATCC CTTCAAGGAT CCATCAGTTA CACAAGTGAC AAGAAATGTT 180
CCACCAGGAC TTGATGAATA TAATCCATTC TCGGATTCTA GAACACCTCC ACCAGGCGGT 240
GTGAAGATGC CTAATGTACC CAATACACAA CCAGCAATAA TGAAACCAAC AGAGGAACAT 300
CCAGCTTATA CACAGATTGC AAAGGAACAT GCATTGGCCC AAGCTGAACT TCTTAAGCGC 360
CAAGAAGAAC TAGAAAGAAA AGCCGCAGAA TTAGATCGTC GGGAACGAGA AATGCAAAAC 420
CTCAGTCAAC ATGGTAGAAA AAATATTTGG CCACCTCTTC CTAGCAATTT TCCTGTCGGA 480
CCTTGTTTCT ATCAGGAATT TTCTGTAGAC ATTCCTGTAG AATTCCAAAA GACAGTAAAG 540
CTTATGTACT ACTTGTGGAT GTTCCATGCA GTAACACTGT TTCTAAATAT CTTCGGATGC 600
TTGGCTTGGT TTTGTGTTGA TTCTGCAAGA GCGGTTGATT TTGGATTGAG TATCCTGTGG 660
TTCTTGCTTT TTACTCCTTG TTCATTTGTC TGTTGGTACA GACCACTTTA TGGAGCTTTC 720
AGGAGTGACA GTTCATTTAG ATTCTTTGTA TTCTTCTTCG TCTATATTTG TCAGTTTGCT 780
GTACATGTAC TCCAAGCTGC AGGATTTCAT AACTGGGGCA ATTGTGGTTG GATTTCATCC 840
CTTACTGGTC TCAACCAAAA TATTCCTGTT GGAATCATGA TGATAATCAT AGCAGCACTT 900
TTCACAGCAT CAGCAGTCAT CTCACTAGTT ATGTTCAAAA AAGTACATGG ACTATATCGC 960
ACAACAGGTG CTAGTTTTGA GAAGGCCCAA CAGGAGTTTG CAACAGGTGT GATGTCCAAC 1020
AAAACTGTCC AGACCGCAGC TGCAAATGCA GCTTCAACTG CAGCATCTAG TGCAGCTCAG 1080
AATGCTTTCA AGGGTAACCA GATTTAAGAA TCTTCAAACA ATACACTGTT ACCTTTTGAC 1140
TGTACCTTTT TCTCCAGTTA CTGTATTCTA CAAATATTTT TATGTTCAAA ACACACAGTA 1200
CAGACAGCAT GGATATTTCC TGTTCACTTG TGCATGGGCT AAAACCAGGA AAACTTCCTT 1260
GTCTTATTAC TTTACCTAAT AGTTTCTTAA TATTTCAGTG CCCCTTGCAG AAAAAATATT 1320
ACATGCTAAA TAAATATTCT CCATATTTTT GGGGGATGAC ATTCAGTGAA TTATTTCAGT 1380
GGTGACCCAC TGAAAATTAA TAATGGTACT TATGATTAAA AACGCATTTA ATACTAACTG 1440
CAGTAGTTCT TTCAAGAATC TTTAGAGATA AGGATTGCAC ATTGGAAAAG TAAACCATGT 1500
TTCATTCCTT TTTCCCTATT TATATTGAAA GAAATAGGCC AGCAGAGACT TAGGGATTTT 1560
AAATTGGCTT GCTTTTTAGC TGTTTCAGTC ACCAGTGAAG AGCCTATGTG CATTTTGTAG 1620
TAGATAATGT AAAATTTGTC ATCTTTTTCT TTTCTTTTTT TTAGAATAGC TGATATTTTG 1680
ATAACAATCT CTAATTTGCA TGGGCACCAC ATTTCTTATA TTAAAAGAAT TAGTGTTTTG 1740
GCTTCTGTAC TGCTTATGGT TGTAGGATTC AGGGGTTAAT GGAATCACAG AAATGATATT 1800
CTGCAAGAAT TTCTTTTAAA TAAAAAGTTT GGGGGTGCAA TATAAGAAGT TTATATAATA 1860
TGCAGTACAT TATCCAAAAG AGAAGGTAGT TAATGCAGTA GAAAGTAGTG GTAATAATTC 1920
CTTTTT



SEQ ID NO:247 PBY4 Protein sequence:

Protein Accession #:

MSDFDSNPFA DPDLNNPFKD PSVTQVTRNV PPGLDEYNPF SDSRTPPPGG VKMPNVPNTQ 60
PAIMKPTEEH PAYTQIAKEH ALAQAELLKR QEELERKAAE LDRREREMQN LSQHGRKNIW 120
PPLPSNFPVG PCFYQEFSVD IPVEFQKTVK LMYYLWMFHA VTLFLNIFGC LAWFCVDSAR 180
AVDFGLSILW FLLFTPCSFV CWYRPLYGAF RSDSSFRFFV FFFVYICQFA VHVLQAAGFH 240
NWGNCGWISS LTGLNQNIPV GIMMIIIAAL FTASAVISLV MFKKVHGLYR TTGASFEKAQ 300
QEFATGVMSN KTVQTAAANA ASTAASSAAQ NAFKGNQI



SEQ ID NO:248 PBH2 DNA sequence

Nucleic Acid Accession #: none found

Coding sequence: 1-613 (underlined sequence corresponds to start and stop codon)

ATGAGAGACA ATAAATCGTG TGCTTTTTTC ATGGGAAAGT TAAATGTTTG TTTTGAAGGC 60
ACAGTAATAG CAGGCTATTC AGTGTTTGCC ACTACCTGCA TCATTCATCT GGCTGTAGCT 120
AGTGCACTAC AATTTCCTAA AAAGTCTTCT CACCCTCACA GGACTGCTCT ACATCTGGCC 180
TCTGCCAATG GAAATTCAGA AGTAGTAAAA CTCCTGCTGG ACAGACGATG TCAACTTAAT 240
ATCCTTGACA ACAAAAAGAG GACAGCTCTG ACAAAGGCCG TACAATGCCA GGAAGATGAA 300
TGTGCGTTAA TGTTGCTGGA ACATGGCACT GATCCGAATA TTCCAGATGA GTATGGAAAT 360
ACCGCTCTAC ACTATGCTAT CTACAATGAA GATAAATTAA TGGCCAAAGC ACTGCTCTTA 420
TACGGTGCTG ATATCGAATC AAAAAACAAG CATGGCCTCA CACCACTGTT ACTTGGTGTA 480
CATGAGCAAA AACAGCAAGT GGTGAAATTT TTAATCAAGA AAAAAGCAAA TTTAAATGCA 540
CTGGATAGAT ATGGAAGGTG TGTGACCTTG GGAACGTTAT TTACCACCAA ATATGTTGTC 600
ATATATGAAA AGTAG



SEQ ID NO:249 PBH2 Protein sequence:
Protein Accession #: none found

MRDNKSCAFF MGKLNVCFEG TVIAGYSVFA TTCIIHLAVA SALQFPKKSS HPHRTALHLA 60
SANGNSEVVK LLLDRRCQLN ILDNKKRTAL TKAVQCQEDE CALMLLEHGT DPNIPDEYGN 120
TALHYAIYNE DKLMAKALLL YGADIESKNK HGLTPLLLGV HEQKQQVVKF LIKKKANLNA 180
LDRYGRCVTL GTLFTTKYVV IYEK



SEQ ID NO:250 PBJ1 DNA sequence
Nucleic Acid Accession #: XM_005829

Coding sequence: 1-3043 (underlined sequence corresponds to start and stop codon)

ATGGTGATCA TCTATCTTTC TTTCTGCAAT TATTACATGG AGTTCTACAG AGAAGAGCTT 60
CCCCACATTG ACTATTTGAT TGACATTCAG TTTGCAACAG GAAAGGTTAC TCAGCCGGGA 120
GAGGACACTT CCTACCATCA ATGCGCTCAG CTTGAAGCCA GAGACGAAGG CACCGACAGT 180
TTATTATTAA ACAATGGCAG CAGCGCCACG CTGAAGACAC GAACGCGCTG TTATGGAACC 240
CCCAGAGGTC TCCCCCATCG TAGCCTGCTC CAGCCGACTC CGCCCACATG TAAAACGAAG 300
ATCAGGAGCA GATTTGAAGA ATTACAAAGT GAATTGGTGC CAGTCAGCAT GTCAGAGACA 360
GACCACATAG CCTCTACTTC CTCTGATAAA AATGTTGGGA AAACACCTGA ATTAAAGGAA 420
GACTCATGCA ACTTGTTTTC TGGCAATGAA AGCAGCAAAT TAGAAAATGA GTCCAAACTA 480
TTGTCATTAA ACACTGATAA AACTTTATGT CAACCTAATG AGCATAATAA TCGAATTGAA 540
GCCCAGGAAA ATTATATTCC AGATCATGGT GGAGGTGAGG ATTCTTGTGC CAAAACAGAC 600
ACAGGCTCAG AAAATTCTGA ACAAATAGCT AATTTTCCTA GTGGAAATTT TGCTAAACAT 660
ATTTCAAAAA CAAATGAAAC AGAACAGAAA GTAACACAAA TATTGGTGGA ATTAAGGTCA 720
TCTACATTTC CAGAATCAGC TAATGAAAAG ACTTATTCAG AAAGCCCCTA TGATACAGAC 780
TGCACCAAGA AATTTATTTC AAAAATAAAG AGCGTTTCAG CATCAGAGGA TTTGTTGGAA 840
GAAATAGAAT CTGAGCTCTT ATCTACGGAG TTTGCAGAAC ATCGAGTACC AAATGGAATG 900
AATAAGGGAG AACATGCATT AGTTCTGTTT GAAAAGTGTG TGCAAGATAA ATATTTGCAG 960
CAGGAACATA TCATAAAAAA GTTAATTAAA GAAAATAAGA AGCATCAGGA GCTCTTCGTA 1020
GACATTTGTT CAGAAAAAGA CAATTTAAGA GAAGAACTAA AGAAAAGAAC AGAAACTGAG 1080
AAGCAGCATA TGAACACAAT TAAACAGTTA GAATCAAGAA TAGAAGAACT TAATAAAGAA 1140
GTTAAAGCTT CCAGAGATCA ACTAATAGCT CAAGACGTTA CAGCTAAAAA TGCAGTTCAG 1200
CAGTTACACA AAGAGATGGC CCAACGGATG GAACAGGCCA ACAAGAAATG TGAAGAGGCA 1260
CGCCAAGAAA AAGAAGCAAT GGTAATGAAA TATGTAAGAG GTGAGAAGGA ATCTTTAGAT 1320
CTTCGAAAGG AAAAAGAGAC ACTTGAGAAA AAACTTAGAG ATGCAAATAA GGAACTTGAG 1380
AAAAACACTA ACAAAATTAA GCAGCTTTCT CAGGAGAAAG GACGGTTGCA CCAGCTGTAT 1440
GAAACTAAGG AAGGCGAAAC GACTAGACTC ATCAGAGAAA TAGACAAATT AAAGGAAGAC 1500
ATTAACTCTC ACGTCATCAA AGTAAAGTGG GCACAAAACA AATTAAAAGC TGAAATGGAT 1560
TCACACAAGG AAACCAAAGA TAAACTCAAA GAAACAACAA CAAAATTAAC ACAAGCAAAG 1620
GAAGAAGCAG ATCAGATACG AAAAAACTGT CAGGATATGA TAAAAACATA TCAGGAGTCA 1680
GAAGAAATTA AATCAAATGA GCTTGATGCA AAGCTTAGAG TCACAAAAGG AGAACTTGAA 1740
AAACAAATGC AAGAAAAATC TGACCAGCTA GAGATGCATC ATGCCAAAAT AAAGGAACTA 1800
GAAGATCTGA AGAGAACATT TAAGGAGGGT ATGGATGAGT TAAGAACACT GAGAACAAAG 1860
GTGAAATGTC TAGAAGATGA ACGATTAAGA ACAGAAGATG AATTATCAAA ATATAAGGAA 1920
ATTATTAATC GCCAAAAAGC TGAAATTCAG AATTTATTGG ACAAGGTGAA AACTGCAGAT 1980
CAGCTACAGG AGCAGCTTCA AAGAGGTAAG CAAGAAATTG AAAATTTGAA AGAAGAAGTG 2040
GAAAGTCTTA ATTCTTTGAT TAATGACCTA CAAAAAGACA TCGAAGGCAG TAGGAAAAGA 2100
GAATCTGAGC TGCTGCTGTT TACAGAAAGG CTCACTAGTA AGAATGCACA GCTTCAGTCT 2160
GAATCCAATT CTTTGCAGTC ACAATTTGAT AAAGTTTCCT GTAGTGAAAG TCAGTTACAA 2220
AGCCAGTGTG AACAAATGAA ACAGACAAAT ATTAATTTGG AAAGTAGGTT GTTGAAAGAG 2280
GAAGAACTGC GAAAAGAGGA AGTCCAAACT CTGCAAGCTG AACTCGCTTG TAGACAAACA 2340
GAAGTTAAAG CATTGAGTAC CCAGGTAGAA GAATTAAAAG ATGAGTTAGT AACTCAGAGA 2400
CGTAAACATG CCTCTAGTAT CAAGGATCTC ACCAAACAAC TTCAGCAAGC ACGAAGAAAA 2460
TTAGATCAGG TTGAGAGTGG AAGCTATGAC AAAGAAGTCA GCAGCATGGG AAGTCGTTCT 2520
AGTTCATCAG GGTCCCTGAA TGCTCGAAGC AGTGCAGAAG ATCGATCTCC AGAAAATACT 2580
GGGTCCTCAG TAGCTGTGGA TAACTTTCCA CAAGTAGATA AGGCCATGTT GATTGAGAGA 2640
ATAGTTAGGC TGCAAAAAGC ACATGCCCGG AAAAATGAAA AGATAGAATT TATGGAGGAC 2700
CACATCAAAC AACTGGTGGA AGAAATTAGG AAAAAAACAA AAATAATTCA AAGTTATATT 2760
TTACGAGAAG AATCAGGCAC ACTTTCTTCA GAGGCATCTG ATTTTAACAA AGTTCATTTA 2820
AGTAGACGGG GTGGCATCAT GGCATCTTTA TATACATCCC ATCCAGCTGA CAATGGATTA 2880
ACATTGGAGC TCTCTTTGGA AATCAACCGA AAATTACAGG CTGTTTTGGA GGATACGTTA 2940
CTAAAAAATA TTACTTTGAA GGAAAATCTA CAAACACTTG GAACAGAAAT AGAACGTCTT 3000
ATTAAACACC AGCATGAACT AGAACAGAGG ACAAAGAAAA CCTAAAACAA GCCTCTTGCT 3060
CAGTAAAGAG ACAAAAGCCA CACAGGAGTA GGTGCCACTG ACCTCTATTG TTGGAGACTT 3120
TGTTCCACTT TTTGTTTCAG CCAGTAAAAA TATTGTTTTG CTTCATCTGT ACACAAAAAA 3180
ATACCCTTTT ACAATATGAA TGCATTGCTG TATATACTGT AAGACTGAAA GCTTTGATGA 3240
AAnTGnrr TGTATGGTGC AATATGACAG CCTGTCATTG AATCTAAACA ACTTAATTTG 3300
CTTGTATTCA TAAGAAGTGT TGAACATTAC AAGGGCTTTT AT



SEQ ID NO:251 PBJ1 Protein sequence:

Protein Accession #: NP_060487

MVIIYLSFCN YYMEFYREEL PHIDYLIDIQ FATGKVTQPG EDTSYHQCAQ LEARDEGTDS 60
LLLNNGSSAT LKTRTRCYGT PRGLPHRSLL QPTPPTCKTK IRSRFEELQS ELVPVSMSET 120
DHIASTSSDK NVGKTPELKE DSCNLFSGNE SSKLENESKL LSLNTDKTLC QPNEHNNRIE 180
AQENYIPDHG GGEDSCAKTD TGSENSEQIA NFPSGNFAKH ISKTNETEQK VTQILVELRS 240
STFPESANEK TYSESPYDTD CTKKFISKIK SVSASEDLLE EIESELLSTE FAEHRVPNGM 300
NKGEHALVLF EKCVQDKYLQ QEHIIKKLIK ENKKHQELFV DICSEKDNLR EELKKRTETE 360
KQHMNTIKQL ESRIEELNKE VKASRDQLIA QDVTAKNAVQ QLHKEMAQRM EQANKKCEEA 420
RQEKEAMVMK YVRGEKESLD LRKEKETLEK KLRDANKELE KNTNKIKQLS QEKGRLHQLY 480
ETKEGETTRL IREIDKLKED INSHVIKVKW AQNKLKAEMD SHKETKDKLK ETTTKLTQAK 540
EEADQIRKNC QDMIKTYQES EEIKSNELDA KLRVTKGELE KQMQEKSDQL EMHHAKIKEL 600
EDLKRTFKEG MDELRTLRTK VKCLEDERLR TEDELSKYKE IINRQKAEIQ NLLDKVKTAD 660
QLQEQLQRGK QEIENLKEEV ESLNSLINDL QKDIEGSRKR ESELLLFTER LTSKNAQLQS 720
ESNSLQSQFD KVSCSESQLQ SQCEQMKQTN INLESRLLKE EELRKEEVQT LQAELACRQT 780
EVKALSTQVE ELKDELVTQR RKHASSIKDL TKQLQQARRK LDQVESGSYD KEVSSMGSRS 840
SSSGSLNARS SAEDRSPENT GSSVAVDNFP QVDKAMLIER IVRLQKAHAR KNEKIEFMED 900
HIKQLVEEIR KKTKIIQSYI LREESGTLSS EASDFNKVHL SRRGGIMASL YTSHPADNGL 960
TLELSLEINR KLQAVLEDTL LKNITLKENL QTLGTEIERL IKHQHELEQR TKKT



SEQ ID NO:252 PBJ6 DNA sequence
Nucleic Acid Accession #: DS3760

Coding sequence: 56-1459 (underlined sequence corresponds to start and stop codon)

1 11 21 31 41 51
| | | | | |

TTGCCGTGAA GGGCTGTGCG GTTCCCGTGC GCGCCGGAGC CTGCTGTGGC CTCTTATGCA 60 

CTCCACCACC CCCATCAGCT CCCTCTTCTC CTTCACCAGC CCCGCAGTGA AGAGACTGCT 120 

AGGCTGGAAG CAAGGAGATG AAGAGGAAAA GTGGGCAGAG AAGGCAGTGG ACTCTCTAGT 180 

GAAGAAGTTA AAGAAGAAGA AGGGAGCCAT GGACGAGCTG GAGAGGGCTC TCAGCTGCCC 240 

GGGGCAGCCC AGCAAATGCG TCACGATTCC CCGCTCCCTG GACGGGCGGC TGCAGGTGTC 300 

CCACCGCAAG GGCCTGCCCC ATGTGATTTA CTGTCGCGTG TGGCGCTGGC CGGATCTGCA 360 

GTCCCACCAC GAGCTGAAGC CGCTGGAGTG CTGTGAGTTC CCATTTGGCT CCAAGCAGAA 420 

AGAAGTGTGC ATTAACCCTT ACCACTACCG CCGGGTGGAG ACTCCAGTAC TGCCTCCTGT 480 

GCTCGTGCCA AGACACAGTG AATATAACCC CCAGCTCAGC CTCCTGGCCA AGTTCCGCAG 540 

CGCCTCCCTG CACAGTGAGC CACTCATGCC ACACAACGCC ACCTATCCTG ACTCTTTCCA 600 

GCAGCCTCCG TGCTCTGCAC TCCCTCCCTC ACCCAGCCAC GCGTTCTCCC AGTCCCCGTG 660 

CACGGCCAGC TACCCTCACT CCCCAGGAAG TCCTTCTGAG CCAGAGAGTC CCTATCAACA 720 

CTCAGTTGAC ACACCACCCC TGCCTTATCA TGCCACAGAA GCCTCTGAGA CCCAGAGTGG 780 

CCAACCTGTA GATGCCACAG CTGATAGACA TGTAGTGCTA TCGATACCAA ATGGAGACTT 840 

TCGACCAGTT TGTTACGAGG AGCCCCAGCA CTGGTGCTCG GTCGCCTACT ATGAACTGAA 900 

CAACCGAGTT GGGGAGACAT TCCAGGCTTC CTCCCGAAGT GTGCTCATAG ATGGGTTCAC 960 

CGACCCTTCA AATAACAGGA ACAGATTCTG TCTTGGACTT CTTTCTAATG TAAACAGAAA 1020 

CTCAACGATA GAAAATACCA GGAGACATAT AGGAAAGGGT GTGCACTTGT ACTACGTCGG 1080 

GGGAGAGGTG TATGCCGAGT GCGTGAGTGA CAGCAGCATC TTTGTGCAGA GCCGGAACTG 1140 

CAACTATCAA CACGGCTTCC ACCCAGCTAC CGTCTGCAAG ATCCCCAGCG GCTGCAGCCT 1200 

CAAGGTCTTC AACAACCAGC TCTTCGCTCA GCTCCTGGCC CAGTCAGTTC ACCACGGCTT 1260 

TGAAGTCGTG TATGAACTGA CCAAGATGTG TACTATCCGG ATGAGTTTTG TTAAGGGTTG 1320 

GGGTGCTGAG TATCATCGCC AGGATGTCAC CAGCACCCCC TGCTGGATTG AGATTCATCT 1380 

TCATGGGCCA CTGCAGTGGC TGGACAAAGT TCTGACTCAG ATGGGCTCTC CACATAACCC 1440 
CATTTCTTCA GTGTCTTAAC AGTCATGTCT TAAGCTGCAT TTCCATAGGA T 



SEQ ID NO:253 PBJ6 Protein sequence:

Protein Accession #: NPJW5896

MHSTTPISSL FSFTSPAVKR LLGWKQGDEE EKWAEKAVDS LVKKLKKKKG AMDELERALS 60
CPGQPSKCVT IPRSLDGRLQ VSHRKGLPHV IYCRVWRWPD LQSHHELKPL ECCEFPFGSK 120
QKEVCINPYH YRRVETPVLP PVLVPRHSEY NPQLSLLAKF RSASLHSEPL MPHNATYPDS 180
FQQPPCSALP PSPSHAFSQS PCTASYPHSP GSPSEPESPY QHSVDTPPLP YHATEASETQ 240
SGQPVDATAD RHVVLSIPNG DFRPVCYEEP QHWCSVAYYE LNNRVGETFQ ASSRSVLIDG 300
FTDPSNNRNR FCLGLLSNVN RNSTIENTRR HIGKGVHLYY VGGEVYAECV SDSSIFVQSR 360
NCNYQHGFHP ATVCKIPSGC SLKVFNNQLF AQLLAQSVHH GFEVVYELTK MCTIRMSFVK 420
GWGAEYHRQD VTSTPCWIEI HLHGPLQWLD KVLTQMGSPH NPISSVS



SEQ ID NO:254 PBJ8 DNA sequence
Nucleic Acid Accession #: AB04684
Coding sequence: 472-4377 (underlined sequence corresponds to start and stop codon)

1 11 21 31 41 51

TGCAGGTTTG CAGGGTCTGA GATTACTTGG GCTTTTCCTG CCTTTTTCTT TTGCTTAAGG 60 

GATGGACAAG GAGCTGAGAT TTATGACCCT TATTAGAGAA AAAAATGTGC CTTGCTAGGG 120 

TGGGGACACT TGGTTGATGC AGTCTCTCTC TCTCTTTCTC GGTGTTTATA ACAAAACAAA 180 

ACCAAAATGA ACTGAGGGGT TTGTAATGGT AGTTTGTTTG TTGCTGGAGA ATGCTACTTT 240 

GCATGCTTTT TTTCTCTTGC AGGGTATGTT CTGTCTTGTG CTTTTTCTTT TAGAAGCTAC 300 

TAAAGGGTGT TGGGGATGCT TCTGACTATT ATGAAGGCCA AAAGGCCTGT TGACTGGGGC 360 

TGCTTTTAAC CCTTTCCTAT TTGCTGAGAA TGCAGCCGTG TGACAGTAAC TGAACATTGG 420 

TCTAAAGTCT TTCCAAAAGG TCAAGGTTCA CAAGAACATC TGCTCAAATT AATGACCATG 480 

GGGGATATGA AGACCCCAGA CTTTGATGAC CTCCTGGCAG CATTTGACAT CCCAGATATG 540 

GTCGATCCTA AAGCAGCTAT TGAGTCTGGA CACGATGACC ATGAAAGCCA CATGAAGCAG 600 

AATGCTCACG GAGAGGATGA CTCCCACGCA CCATCATCTT CTGATGTGGG TGTCAGCGTT 660 

ATCGTCAAGA ATGTTCGGAA CATTGACTCT TCCGAGGGCG GGGAGAAAGA CGGCCACAAC 720 

CCCACTGGCA ATGGCTTACA TAATGGGTTT CTCACAGCAT CCTCCCTTGA CAGTTACAGT 780 

AAAGATGGAG CAAAGTCCTT GAAAGGAGAT GTGCCTGCCT CTGAGGTGAC ACTGAAAGAC 840 

TCGACATTCA GCCAGTTTAG CCCGATCTCC AGTGCTGAAG AGTTTGATGA CGACGAGAAG 900 

ATTGAGGTGG ATGACCCCCC TGACAAGGAG GACATGCGAT CAAGCTTCAG GTCGAATGTG 960 

TTGACGGGGT CGGCTCCCCA GCAGGACTAC GATAAGCTGA AGGCACTCGG AGGGGAAAAC 1020 

TCCAGCAAAA CTGGACTCTC TACGTCAGGC AATGTGGAGA AAAACAAAGC TGTTAAGAGA 1080 

GAAACAGAAG CCAGTTCTAT AAACCTGAGT GTTTATGAAC CTTTTAAAGT CAGAAAAGCA 1140 

GAGGATAAAT TGAAGGAAAG CTCTGACAAG GTGCTGGAAA ACAGAGTCCT AGATGGGAAG 1200 

CTGAGCTCCG AGAAGAATGA CACCAGCCTC CCCAGCGTTG CGCCATCAAA GACAAAGTCG 1260 

TCCTCCAAGC TCTCGTCCTG CATCGCTGCC ATCGCGGCTC TCAGCGCTAA AAAGGCGGCT 1320 

TCAGACTCCT GCAAAGAACC AGTGGCCAAT TCGAGGGAAT CCTCCCCGTT ACCAAAAGAA 1380 

GTAAATGACA GTCCGAGAGC CGCTGACAAG TCTCCTGAAT CCCAGAATCT CATCGACGGG 1440 

ACCAAAAAAC CATCCCTGAA GCAACCGGAT AGTCCCAGAA GCATCTCAAG TGAGAACAGC 1500 

AGCAAAGGAT CCCCGTCCTC TCCCGCAGGG TCCACACCAG CAATCCCCAA AGTCCGCATA 1560 

AAAACCATTA AGACATCTTC TGGGGAAATC AAGAGAACAG TGACCAGGGT ATTGCCAGAA 1620 

GTGGATCTTG ACTCTGGAAA GAAACCTTCC GAGCAGACAG CGTCCGTGAT GGCCTCTGTG 1680 

ACATCCCTTC TGTCGTCTCC AGCATCAGCC GCCGTCCTTT CCTCTCCCCC CAGGGCGCCT 1740 

CTCCAGTCTG CGGTCGTGAC CAATGCAGTT TCCCCTGCAG AGCTCACCCC CAAACAGGTC 1800 

ACAATCAAGC CTGTGGCTAC TGCTTTCCTC CCAGTGTCTG CTGTGAAGAC GGCAGGATCC 1860 

CAAGTCATTA ATTTGAAGCT CGCTAACAAC ACCACGGTGA AAGCCACGGT CATATCTGCT 1920 

GCCTCTGTCC AGAGTGCCAG CAGCGCCATC ATTAAAGCTG CCAACGCCAT CCAGCAGCAA 1980 

ACTGTCGTGG TGCCGGCATC CAGCCTGGCC AATGCCAAAC TCGTGCCAAA GACTGTGCAC 2040 

CTTGCCAACC TTAACCTTTT GCCTCAGGGT GCCCAGGCCA CCTCTGAACT CCGCCAAGTG 2100 

CTAACCAAAC CTCAGCAACA AATAAAGCAG GCAATAATCA ATGCAGCAGC CTCGCAACCC 2160 

CCCAAAAAGG TGTCTCGAGT CCAGGTGGTG TCGTCCTTGC AGAGTTCTGT GGTGGAAGCT 2220 

TTCAACAAGG TGCTGAGCAG TGTCAATCCA GTCCCTGTTT ACATCCCAAA CCTCAGTCCT 2280 

CCCGCCAATG CAGGGATCAC GTTACCGACG CGTGGGTACA AGTGCTTGGA GTGTGGGGAC 2340 

TCCTTTGCAC TTGAAAAGAG TCTGACCCAG CACTACGACA GACGGAGCGT GCGCATCGAA 2400 

GTAACGTGCA ACCATTGTAC AAAGAACCTC GTTTTTTACA ACAAATGCAG CCTCCTTTCC 2460 

CATGCCCGTG GGCATAAGGA GAAAGGGGTG GTAATGCAAT GCTCCCACTT AATTTTAAAG 2520 

CCAGTCCCAG CAGATCAAAT GATAGTTTCT CCGTCAAGCA ATACTTCCAC TTCAACTTCC 2580 

ACTCTTCAGA GCCCTGTGGG AGCTGGCACA CACACTGTCA CAAAAATTCA GTCTGGCATA 2640 

ACTGGGACAG TCATATCGGC TCCTTCAAGC ACTCCCATCA CCCCAGCCAT GCCCCTAGAT 2700 

GAAGACCCCT CCAAACTGTG TAGACATAGT CTAAAATGTT TGGAGTGTAA TGAAGTCTTC 2760 

CAGGACGAGA CATCACTGGC TACACATTTC CAGCAGGCTG CAGATACGAG TGGACAAAAG 2820 

ACTTGCACTA TCTGCCAGAT GCTGCTTCCT AACCAGTGCA GTTATGCATC ACACCAGAGA 2880 

ATCCATCAGC ACAAATCTCC CTACACCTGC CCTGAGTGTG GGGCCATCTG CAGGTCGGTG 2940 

CACTTCCAGA CCCACGTCAC CAAGAACTGT CTGCACTACA CGAGGAGAGT TGGTTTTCGA 3000 

TGTGTGCATT GCAATGTTGT GTACTCTGAT GTGGCTGCTC TGAAGTCTCA CATTCAAGGT 3060 

TCTCACTGTG AAGTCTTCTA CAAGTGTCCT ATTTGTCCAA TGGCGTTTAA GTCTGCCCCA 3120 

AGCACACATT CCCACGCCTA CACACAGCAT CCTGGCATCA AGATAGGAGA ACCAAAAATA 3180 

ATATATAAGT GTTCCATGTG CGACACTGTG TTCACCCTGC AAACCTTGCT GTATCGCCAC 3240 

TTTGACCAAC ACATTGAAAA CCAGAAGGTG TCTGTTTTCA AGTGTCCAGA CTGTTCTCTT 3300 

TTATATGCAC AGAAGCAACT TATGATGGAC CATATCAAGT CTATGCATGG AACATTGAAA 3360 

AGTATTGAAG GGCCTCCAAA CTTGGGTATA AACTTGCCTT TGAGCATTAA GCCTGCAACT 3420 

CAAAATTCAG CAAATCAGAA CAAAGAGGAC ACCAAATCCA TGAATGGGAA AGAGAAATTG 3480 

GAAAAGAAAT CTCCATCTCC TGTGAAAAAA TCAATGGAAA CCAAGAAAGT GGCCAGTCCT 3540 

GGGTGGACGT GTTGGGAGTG TGACTGCCTG TTCATGCAGA GAGATGTGTA CATATCCCAC 3600 

GTGAGGAAGG AGCACGGGAA GCAAATGAAG AAACACCCCT GCCGCCAGTG TGACAAGTCT 3660 

TTCAGCTCGT CCCACAGCCT GTGCCGGCAC AACCGGATCA AGCACAAAGG CATCAGGAAA 3720 

GTGTACGCCT GCTCGCACTG CCCAGACTCC AGACGTACCT TTACCAAACG TTTGATGCTG 3780 

GAGAAGCACG TCCAGCTGAT GCATGGCATC AAGGACCCTG ACCTGAAAGA AATGACAGAT 3840 

GCCACCAATG AGGAGGAAAC AGAAATAAAA GAAGACACTA AGGTCCCCAG TCCCAAGCGG 3900 

AAGTTGGAAG AACCAGTTCT GGAGTTCAGG CCTCCCCGAG GAGCAATCAC TCAACCACTG 3960 

AAAAAGCTGA AAATCAATGT TTTTAAGGTT CACAAGTGTG CCGTGTGTGG CTTCACCACC 4020 

GAAAACCTGC TGCAATTCCA CGAACACATC CCTCAGCACA AATCGGATGG TTCTTCCTAC 4080 

CAGTGCCGGG AGTGTGGCCT CTGCTACACG TCTCACGTCT CTCTGTCCAG GCACCTCTTC 4140 

ATCGTACACA AGTTAAAGGA ACCTCAGCCA GTGTCCAAGC AAAATGGGGC TGGGGAAGAT 4200 

AACCAACAGG AGAACAAACC CAGCCACGAG GATGAATCCC CTGATGGCGC CGTGTCAGAC 4260 

AGAAAGTGCA AAGTGTGCGC AAAAACTTTT GAAACTGAAG CTGCCTTAAA TACTCACATG 4320 

CGGACACACG GCATGGCCTT CATCAAATCC AAAAGGATGA GCTCAGCCGA GAAATAGCCA 4380 

CAGATGCTCC ATGAGGAAAA TCCCTGTCCA CATTGGAATA AAAAAGACAT TTTTGTTACA 4440 

AAGTTTGCAG TATAATAGAG TTAACAGTAC TGTCTAGGCT GTTGCAATAT ATTCTCTTTC 4500 

AATGTACCTT CCTTCACCTC GTCGTATATA TCCTCGATAA GTATTAAAAC AGTATTTGAG 4560 

TTTAAAAGAG TTTGTATATA TTTAAATGAA TAACTTTTTA TACTCTTTGT TACATGTTTG 4620 

TATCAGTATT TAGTGGAAAA CCATTTGAGT TGTTTTGGGT TAGAATTTTT CTTTTTGTAC 4680 

TGTTTCTTTA AAACAGAGTT CTTAGTAACA GGGGCAGTTC CTGAATTCAA ATAAACCATT 4740 

TTGTATGTTT GGATTTTGAA TGGGTTAACT AATTACAGGC TAAAATAATG CCTTTTTTAG 4800 

TGTTTTTAAT TTTTAGAATT CACTACATAA ATTGTAAGTA ATTGTGGGTC TCAAAAACAC 4860 

TAGGAACTTT TAAGTGTCTT AGCACTTCCT CGATGTGCCT GCCCTGAGGG AGTGAGTTCA 4920 

CATTTGAGAC AACTGCACTC CAGTGTGGAC GTGCCTTTGT CTTCAGGCCA TGCCGAAGGG 4980 

TGTTTAAAGC AGTCTTGCAG GTCGCTCCTT TCCCAGCCGT GGATAAAAAC TGAAGCTAGG 5040 

AATCTAATAA GGAATGCTGA TTTCCTCAGT TCCATTTTGA GGAATGGGGA AGGCTATTCT 5100 

AAAGAAAAAA ATGGGATTTG TTTTCTCGGC AGATCTGCAA GGCTGGCTTT AAGAGCACAA 5160 

GGAGGGAAAG TAACGAAAGG GCTGGACTAC TATAAAAGTT ACAAATACGT AGTTAGACCA 5220 

ATAGATTTAT ATAGTCAGGT TTTTGTCATG TAATTTATTA ACTAACTATT ACAGAAACAC 5280 

AGCTAAGAAT ATCAAGTATT TCTCTGGCTC TTGACAGAAA AAAATCAGTT GACTTAACCC 5340 

TTTGCTGTCA AAAGAGTTGG CGTTTCCTGT TCTGGGTGCT ACTGCCAAAC GTTATGGTAC 5400 

TTAGAGTCGG GATGCACAAC TTCAACCACC GACTTATCAA TGCAGCCGCC TGTGTATTGC 5460 

AATTGGCCGT TACCTTAAGC ACTGAGCCAC CCGGGTTTAG TTCAGCCATT TCAAGAAGTA 5520 

TATTTAACGT CGGTAGTTCT GCTTTATTAA AATGCAGCAG AGGTACTCTT CTGTCCCTTC 5580 

CGTTTATAGT TCTCTGAGAG AGTTCTATTT TTTGGTTTTG TTTTGTGTTT TCTTTTGCAT 5640 

TTTGTATCTT GTATTTATCC CTGAACATGT TTTGTACCTT TTTTTTTTTT TTTTTTTTAA 5700 

GAAAAGGAAT TCTTTTGTGT ATATATAGAT ACTTGCATGA TATACTGTAG TCAATGTTCG 5760 

GTTCCTCAAA AGGTCTTGCT GCTGTCAGGT GTTATGCACT CCATCCATCA TAACTGTATG 5820 
AAACACATTT CATATGTAAA TAAACGTGGG ACATTTG 



SEQ ID NO:255 PBJ8 Protein sequence: 
5 5 Protein Accession #: BAB1 3455 

MKTPDFDDLL AAFDIPDMVD PKAAIESGHD DHESHMKQNA HGEDDSHAPS SSDVGVSVIV 60 
KNVRN1DSSE GGEKDGHNPT GNGLHNGFLT ASSLDS YSKD G AKSLKGDVP ASEVTLKDST 1 20 
FSQFSPISSA EEFDDDEKIE VDDPPDKEDM RSSFRSNVLT GSAPQQDYDK LKALGGENSS 180 

60 KTGLSTSGNV EKNKA V KRET EASSINLSVY EPFKVRKAED KLKESSDKVL ENRVLDGKLS 240 
SEKNDTSLPS VAPSKTKSSS KLSSCIAAIA ALSAKKAASD SCKEPVANSR ESSPLPKEVN 300 
DSPRAADKSP ESQNLIDGTK KPSLKQPDSP RSISSENSSK GSPSSPAGST PAIPKVRIKT 360 
IKTSSGEIKR TVTRVLPEVD LDSGKKPSEQ TASVMASVTS LLSSPASAAV LSSPPRAPLQ 420
SAVVTNAVSP AELTPKQVTI KPVATAFLPV SAVKTAGSQV INLKLANNTT VKATVISAAS 480

VQSASSAIIK AANAIQQQTV VVPASSLANA KLVPKTVHLA NLNLLPQGAQ ATSELRQVLT 540
KPQQQIKQAI INAAASQPPK KVSRVQVVSS LQSSVVEAFN KVLSSVNPVP VYIPNLSPPA 600
NAGITLPTRG YKCLECGDSF ALEKSLTQHY DRRSVRIEVT CNHCTKNLVF YNKCSLLSHA 660 
RGHKEKGVVM QCSHLILKPV PADQMIVSPS SNTSTSTSTL QSPVGAGTHT VTKIQSGITG 720 
TVISAPSSTP ITPAMPLDED PSKLCRHSLK CLECNEVFQD ETSLATHFQQ AADTSGQKTC 780 

TICQMLLPNQ CSYASHQRIH QHKSPYTCPE CGAICRSVHF QTHVTKNCLH YTRRVGFRCV 840
HCNVVYSDVA ALKSHIQGSH CEVFYKCPIC PMAFKSAPST HSHAYTQHPG IKIGEPKIIY 900
KCSMCDTVFT LQTLLYRHFD QHIENQKVSV FKCPDCSLLY AQKQLMMDHI KSMHGTLKSI 960 
EGPPNLX3INL PLSIKPATQN SANQNKEDTK SMNGKEKLEK KSPSPVKKSM ETKKVASPGW 1020 
TCWECDCLFM QRDVYISHVR KEHGKQMKKH PCRQCDKSFS SSHSLCRHNR IKHKGIRKVY 1080

ACSHCPDSRR TFTKRLMLEK HVQLMHGIKD PDLKEMTDAT NEEETEIKED TKVPSPKRKL 1140
EEPVLEFRPP RGAITQPLKK LKINVFKVHK CAVCGFTTEN LLQFHEHIPQ HKSDGSSYQC 1200 
RECGLCYTSH VSLSRHLFIV HKLKEPQPVS KQNGAGEDNQ QENKPSHEDE SPDGAVSDRK 1260 
CKVCAKTFET EAALNTHMRT HGMAFIKSKR MSSAEK 



SEQ ID NO:256 PBM1 DNA sequence
Nucleic Acid Accession #: AF111847

Coding sequence: 58-1608 (underlined sequence corresponds to start and stop codon)

1          11         21         31         41         51
|          |          |          |          |          |

TTTTCGTCGA CTCTTACCGG TTGGCTGGGC CAGCTGCGCC GCGGCTCACA GCTGACGATG 60 

GGGGACCCCA GCAAGCAGGA CATCTTGACC ATCTTCAAGC GCCTCCGCTC GGTGCCCACT 120

AACAAGGTGT GTTTTGATTG TGGTGCCAAA AATCCCAGCT GGGCAAGCAT AACCTATGGA 180 

GTGTTCCTTT GCATTGATTG CTCAGGGTCC CACCGGTCAC TTGGTGTTCA CTTGAGTTTT 240

ATTCGATCTA CAGAGTTGGA TTCCAACTGG TCATGGTTTC AGTTGCGATG CATGCAAGTC 300 

GGAGGAAACG CTAGTGCATC TTCCTTTTTT CATCAACATG GGTGTTCCAC CAATGACACC 360 

AATGCCAAGT ACAACAGTCG TGCTGCTCAG CTCTATAGGG AGAAAATCAA ATCGCTCGCC 420 

TCTCAAGCAA CACGGAAGCA TGGCACTGAT CTGTGGCTTG ATAGTTGTGT GGTTCCACCT 480 

TTGTCCCCTC CACCAAAGGA GGAAGATTTT TTTGCCTCTC ACGTTTCTCC TGAGGTGAGT 540

GACACAGCGT GGGCATCAGC AATAGCAGAA CCATCTTCTT TAACATCAAG GCCTGTGGAA 600 

ACCACTTTGG AAAATAATGA AGGTGGACAA GAGCAAGGAC CAAGTGTGGA AGGTCTTAAT 660

GTACCAACAA AGGCTACTTT AGAGGTATCC TCTATCATAA AAAAGAAACC AAATCAAGCT 720 

AAAAAAGGCC TTGGGGCCAA AAAAGGAAGT TTGGGAGCTC AGAAACTGGC AAACACATGC 780 

TTTAATGAAA TTGAAAAACA AGCTCAAGCT GCGGATAAAA TGAAGGAGCA GGAAGACCTG 840

GCCAAGGTGG TATCTAAAGA AGAATCAATT GTTTCATCAT TACGATTAGC CTATAAGGAT 900 

CTTGAAATTC AAATGAAGAA AGACGAAAAG ATGAACATTA GTGGCAAAAA AAATGTTGAC 960 

TCAGACAGAC TCGGCATGGG ATTTGGAAAT TGCAGAAGTG TTATTTCACA TTCAGTGACT 1020 

TCAGATATGC AGACCATAGA GCAGGAATCA CCCATTATGG CAAAACCAAG AAAAAAGTAT 1080 

AATGATGACA GTGACGATTC ATATTTTACT TCCAGCTCAA GTTACTTTGA CGAGCCAGTG 1140

GAGTTAAGGA GCAGTTCTTT CTCTAGCTGG GATGACAGTT CAGATTCCTA TTGGAAAAAA 1200

GAGACCAGCA AAGATACTGA AACAGTTCTG AAAACCACAG GCTATTCAGA CAGACCTACT 1260 

GCTCGCCGCA AGCCAGATTA TGAGCCAGTT GAAAATACAG ATGAGGCCCA GAAGAAGTTT 1320

GGCAATGTCA AGGCCATTTC ATCAGATATG TATTTTGGAA GACAATCCCA GGCTGATTAT 1380 

GAGACCAGGG CCCGCCTAGA GAGGCTGTCG GCAAGTTCCT CCATAAGCTC GGCTGATCTG 1440

TTCGAGGAGC CGAGGAAGCA GCCAGCAGGG AACTACAGCC TGTCCAGTGT GCTGCCCAAC 1500 

GCCCCCGACA TGGCGCAGTT CAAGCAGGGA GTGAGATCGG TTGCTGGAAA ACTCTCCGTC 1560

TTTGCTAATG GAGTCGTGAC TTCAATTCAG GATCGCTACG GTTCTTAATA CTGAAGTCAT 1620 

GATGTGTATT TCCTGGAGAA ATTCCTCTTT AAATGAACAA GTAACCACAT CTCAGGCGGC 1680 

AGTGAAGTCC AGATAGTTTT GCAGATTGTT TTGCTACTTT TTCATATGGT ATATGTTTCT 1740

GATTTTTAAT ATTTCTTTTG AGAAATTCTG AGTTCTGATG TAGGAGCTTT CCTGTGATTT 1800 

CTGTTTCACG TTCCTTCCTG TCACACCCTC CTTTGGCGTC TCTGTGTATA TCCTTGCTTT 1860 

ATTTTCTTGG AACCTTTGAT TTCAACACTG AGGGCCTGGA GACCTCGGCT CCTCCTGCTC 1920 

CTGAACCAGG AGGCTTCATG TGGGGGAGGA GGAGAGGTCT CCATGTGACA CATGGGCTCA 1980 

GGGCTGCCAG AATCAGCGGA TGCTGGATGG GCCTGCAGAA ACAACACTCA CCACACACAC 2040

TTCCTTCAAA AGACCAAAAG TGACTGGTGT CTCGTGTGAC AGATTGCTTC ATTTATGTTT 2100 

CTACATAGTA AGGTGACTGC CAAATAATAT TTGAAGTCAT CTGTCTCTTT GTAAATTATT 2160 

TTATATGACC TATAAATTTA AAAATGTTTT TCAGTGAGTG CTTTTAACAA ACTTAAGCTT 2220 

A CTGCCCTGCC AAGGGAATTA ATGTTATCTT GTGAAAGGTG TTGCTGTTTG AATTGATGAG 2280 

AAATGGAAGA TGAGAACTCC CTAAGAGTTC TCATAATAAA TCATCTCATC ACAAATCAAT 2340

ACGGTATACA GAGTTAAAGT GGAATGAGGT AAGAAGATAC AGCTACAGAA AATAGTTGCG 2400 

TGTATGGGAG AACAGTCATT GTAATTGGGT AGTTTTGTTA ATAAATATTT TTAAATCTTG 2460 

CTTTTCAGAA ATTACCGAAT GTGTATAAAC AAATAAAGAA AAATAATTTA GCTGTGTTTT 2520 

AGACAGCATT AGAATATATT GTTCAGCACA GTAAAATATA TTTGAAATTT GATAAGCCAA 2580 

AAATGTGGTT TTGAATGAAT ATTTTGTGAA TCTTTCTTAA AAGCTCAAAT TTGTAGACTT 2640

CTAAATAGAA TAAACACTTG CAGCAGAAAA AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA 2700 

AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA 2760 
AAAAAAAAAA AAAAAAAAAA AAAAAAAA 


SEQ ID NO:257 PBM1 Protein sequence: 
Protein Accession #: CAB76901

MGDPSKQDIL TIFKRLRSVP TNKVCFDCGA KNPSWASITY GVFLCIDCSG SHRSLGVHLS 60
FIRSTELDSN WSWFQLRCMQ VGGNASASSF FHQHGCSTND TNAKYNSRAA QLYREKIKSL 120

ASQATRKHGT DLWLDSCVVP PLSPPPKEED FFASHVSPEV SDTAWASAIA EPSSLTSRPV 180 
ETTLENNEGG QEQGPSVEGL NVPTKATLEV SSIIKKKPNQ AKKGLGAKKG SLGAQKLANT 240 
CFNEIEKQAQ AADKMKEQED LAKVVSKEES IVSSLRLAYK DLEIQMKKDE KMNISGKKNV 300
DSDRLGMGFG NCRSVISHSV TSDMQTIEQE SPIMAKPRKK YNDDSDDSYF TSSSSYFDEP 360
VELRSSSFSS WDDSSDSYWK KETSKDTETV LKTTGYSDRP TARRKPDYEP VENTDEAQKK 420

FGNVKAISSD MYFGRQSQAD YETRARLERL SASSSISSAD LFEEPRKQPA GNYSLSSVLP 480
NAPDMAQFKQ GVRSVAGKLS VFANGVVTSI QDRYGS 



SEQ ID NO:258 PBM4 DNA sequence
Nucleic Acid Accession #: D30891

Coding sequence: 1-4032 (underlined sequence corresponds to start and stop codon) 

ATGGATACTG TCATGAAGCA GACACATGCT GACACACCTG TTGATCATTG TCTATCTGGC 60
ATAAGAAAGT GTAGCAGCAC CTTTAAGCTT AAAAGTGAAG TCAACAAGCA TGAAACAGCC 120
CTTGAAATGC AGAATCCAAA TTTGAACAAT AAAGAATGTT GTTTCACCTT TACGTTGAAT 180
GGAAACTCCA GAAAATTAGA CCGTAGTGTG TTTACAGCAT ATGGTAAACC CAGCGAGAGT 240 
ATCTACTCAG CCCTGAGTGC TAATGACTAT TTCAGTGAAA GGATAAAGAA TCAGTTTAAT 300 
AAGAACATTA TTGTTTATGA AGAAAAGACA ATAGATGGAC ATATAAATTT AGGAATGCCT 360
CTCAAGTGCC TGCCTAGTGA TTCTCATTTT AAAATTACAT TTGGTCAAAG AAAGAGTAGC 420
AAAGAAGATG GACACATATT ACGCCAATGT GAAAATCCAA ACATGGAATG CATTCTTTTT 480
CATGTTGTTG CTATAGGAAG GACAAGAAAG AAGATTGTTA AGATCAACGA ACTTCATGAA 540 
AAAGGAAGTA AACTTTGTAT TTATGCCTTG AAGGGTGAGA CTATTGAAGG AGCCTTATGC 600
AAGGATGGCC GTTTTCGGTC TGACATAGGT GAATTTGAAT GGAAACTAAA GGAAGGTCAT 660
AAGAAAATTT ATGGAAAACA GTCCATGGTG GATGAAGTAT CTGGAAAAGT CTTAGAAATG 720
GACATTTCAA AAAAAAAAGC ATTACAACAG AAAGATATCC ATAAAAAAAT TAAACAGAAT 780
GAAAGTGCCA CTGATGAAAT TAATCACCAG AGTCTGATAC AGTCTAAGAA AAAAGTCCAC 840
AAACCAAAGA AAGATGGAGA GACCAAAGAT GTAGAACACA GCAGAGAGCA AATTCTCCCA 900
CCTCAGGATC TAAGCCATTA TATTAAAGAT AAAACTCGCC AGACAATTCC CAGGATTAGA 960 

AATTATTACT TTTGTAGTTT GCCCCGAAAA TATAGGCAAA TAAACTCACA AGTTAGACGG 1020
AGGCCGCATC TGGGTAGGCG GTATGCTATT AATCTGGATG TCCAAAAGGA GGCAATTAAT 1080 
CTCTTAAAGA ATTATCAAAC GTTGAATGAA GCCATAATGC ATCAGTATCC GAATTTTAAA 1140
GAGGAGGCAC AGTGGGTAAG AAAATATTTT CGGGAAGAAC AAAAGAGAAT GAATCTTTCA 1200
CCAGCTAAGC AATTCAACAT ATATAAAAAG GACTTCGGAA AAATGACTGC AAATTCTGTT 1260

TCAGTTGCAA CCTGCGAACA GCTTACATAT TATAGCAAGT CAGTTGGGTT CATGCAATGG 1320
GACAATAATG GAAACACAGG TAATGCTACT TGCTTTGTCT TCAATGGTGG TTATATTTTC 1380
ACCTGTCGAC ATGTTGTACA TCTTATGGTG GGTAAAAACA CACATCCAAG TTTGTGGCCA 1440 
GATATAATTA GCAAATGTGC GAAGGTAACC TTCACTTATA CAGAGTTCTG CCCTACTCCT 1500
GACAATTGGT TTTCCATTGA GCCATGGCTT AAAGTGTCCA ATGAAAATCT AGATTATGCC 1560

ATTTTAAAAC TAAAAGAAAA TGGAAATGCG TTTCCTCCAG GACTATGGCG ACAGATTTCT 1620
CCTCAACCAT CTACTGGTTT GATTTATTTA ATTGGTCATC CTGAAGGCCA GATCAAGAAA 1680 
ATAGATGGTT GTACTGTGAT TCCTCTAAAC GAACGATTGA AAAAATATCC AAACGATTGT 1740
CAAGATGGGT TGGTAGATCT CTATGATACC ACCAGTAATG TATACTGTAT GTTTACCCAA 1800 
AGAAGTTTCC TATCAGAGGT TTGGAACACA CACACGCTTA GTTATGATAC TTGTTTCTCT 1860

GATGGGTCCT CAGGCTCCCC AGTGTTTAAT GCATCTGGCA AATTGGTTGC TTTGCATACC 1920
TTTGGGCTTT TTTATCAACG AGGATTTAAT GTGCATGCCC TTATTGAATT TGGTTATTCT 1980 
ATGGATTCTA TTCTTTGTGA TATTAAAAAG ACAAATGAGA GCTTGTATAA ATCATTAAAT 2040
GATGAGAAAC TTGAGACCTA CGATGAAGAG AAAGCCCGGC CCAGGCCAGC CTACCGGCGA 2100 
CTAGGATGCT TTCGCTTTCG CTCTCGCTTT CCAATACTCG GGACTGGGGA AACCGGGAGA 2160

ATAGAAGCAG GCAAGGACCG CCGTGGGCAC GGGGTCAGTG AGACAGGGTC CTGCTCGCGG 2220
CGTCAAGGAG GAGCGCTGTG GGTGTCCCCA GCGCAGCCAA TCGGCTTCCG AAGTAGCTGG 2280 
AGCTCTGGAG CCTTTGCTTC CTCAAATACG AGCGGGAACT GCGTTGAGCG CTGGATTCCA 2340 
GGCCGAGTGC TGGCGAGGCG CGCAGTCTCT AAAGAGCAAC AGAATAATTG CAGTACTTCT 2400
CTAATGAGGA TGGAGTCTAG AGGAGACCCA AGAGCCACAA CTAATACCCA GGCTCAAAGA 2460

TTCCATTCAC CTAAGAAAAA TCCAGAAGAC CAGACCATGC CCCAAAATAG GACAATATAT 2520
GTTACCTTGA AGGCTGTCAG AAAAGAGATA GAAACTCACC AAGGCCAAGA AATGCTTGTG 2580
CGTGGCACAG AAGGAATCAA AGAGTACATA AACCTTGGAA TGCCCCTCAG TTGTTTCCCT 2640 
GAAGGTGGCC AGGTGGTCAT TACATTTTCC CAAAGTAAAA GTAAGCAGAA GGAAGATAAC 2700
CACATATTTG GCAGGCAGGA CAAAGCATCG ACTGAATGTG TCAAATTTTA CATTCATGCA 2760 

ATTGGAATTG GGAAGTGTAA AAGAAGGATT GTTAAATGTG GGAAGCTTCA CAAAAAGGGG 2820
CGCAAACTCT GTGTTTATGC TTTCAAAGGA GAAACCATCA AGGATGCACT GTGCAAGGAT 2880 
GGCAGATTTC TTTCCTTTCT GGAGAATGAT GATTGGAAAC TCATTGAAAA CAATGACACC 2940 
ATTTTAGAAA GCACCCAGCC AGTTGATGAA TTAGAAGGCA GATACTTTCA GGTTGAGGTT 3000 
GAGAAAAGAA TGGTCCCCAG TGCAGCAGCT TCTCAGAATC CTGAGTCAGA GAAAAGAAAC 3060 

ACCTGTGTGT TGAGAGAACA AATCGTGGCT CAGTACCCCA GTTTGAAAAG AGAAAGTGAA 3120
AAAATCATTG AAAACTTCAA GAAAAAAATG AAAGTAAAAA ATGGGGAAAC ATTATTTGAA 3180 
TTGCATAGAA CAACGTTTGG GAAAGTAACA AAAAATTCTT CTTCGATTAA AGTAGTGAAA 3240 
CTTCTTGTAC GTCTCAGTGA CTCAGTTGGG TACTTATTCT GGGACAGTGC AACTACGGGT 3300 
TACGCCACCT GCTTTGTTTT TAAAGGATTG TTCATTTTAA CTTGTCGGCA TGTAATAGAT 3360

AGCATTGTGG GAGACGGAAT AGAGCCAAGT AAGTGGGCAA CCATAATTGG TCAATGTGTA 3420
AGGGTGACAT TTGGTTATGA AGAGCTAAAA GACAAGGAAA CAAACTACTT TTTTGTTGAA 3480 
CCTTGGTTTG AGATACATAA TGAAGAGCTT GACTATGCTG TCCTGAAACT GAAGGAAAAT 3540 
GGACAACAAG TACCTATGGA ACTATATAAT GGAATTACTC CTGTGCCACT TAGTGGGTTG 3600 

ATACATATTA TTGGCCATCC ATATGGAGAA AAAAAGCAGA TTGATGCTTG TGCTGTGATC 3660

CCTCAGGGTC AGCGAGCAAA GAAATGTCAG GAACGTGTTC AGTCTAAAAA AGCAGAAAGT 3720

CCAGAGTATG TCCATATGTA TACTCAAAGA AGTTTCCAGA AAATAGTTCA CAACCCTGAT 3780 
GTGATTACCT ATGACACTGA ATTTTTCTTT GGGGCTTCCG GCTCCCCTGT GTTTGATTCA 3840
AAAGGTTCAT TGGTGGCCAT GCATGCTGCT GGCTTTGCTT ATACTTACCA AAATGAGACT 3900 
CGTAGTATCA TTGAGTTTGG CTCTACCATG GAATCCATCC TCCTTGATAT TAAGCAAAGA 3960
CATAAACCAT GGTATGAAGA AGTATTTGTA AATCAGCAGG ATGTAGAAAT GATGAGTGAT 4020
GAGGACTTGT GAGAATTCAG TCTACTGGAT TTAAGGGAAT GGCTTATGGA GTTGTTATTT 4080
CGTAGGCATT GAAAATGGTT TTCTAAACTC CAAAATGGTC ATCTTATCAA TAATAATAAT 4140 
ATTGACCATT TCCTATCTGC CAGGCATTTT TCTAAGCACA TGAAGAAATT AGTCCTAACA 4200 
ACACTATGAG ATGGACTATA ACTTGCCCAA ATTTTTTTTT TTTTTGAGAC TGAGTCTCAC 4260

TCTGTCGCCT GGGCTGGAGT ACAGTGGTGC GATCTCAGCT CACTGCAACT TCCACCTCCC 4320

AGGTTCAAGC GATTCTTATG CCTCAGTCTC CTGAGCAGCT GGGATTACAG GCAAACGCCA 4380 
CCACACCCAG CTAAATTTTT TTTTTTTTTT TGTATTTTTA GTAGAGACAG GGTTTCACCA 4440
TGTTGGTCAG GCGGGTCTCG AACTCCTGAC CTCGTGATCC ACCTGCCTCG GCCTTCCAAA 4500 
GTGCTGGGAT TACAAGTTTG AGCCACTGCA CCTGGCTAAC TTGCCCTATT TTAAAGTCAA 4560 

GCAATGGGAA GAATAACAAG ATTATATAGT AATCAGTTTC ATGACACTAA AAGTCATATA 4620
GTCATAGGGT TTTTTCATCT TTCATATCTT TGCCTAAATT CATTTGCTAC AGTGCAGGAA 4680 
CCAAAACTTG TTCATCTCAT GATTCCCTAC ATCTGACATA AGGAAAGTAA GTGCTCAGAA 4740
AAATGTGCAG GTCAATAAGT TGCAAAAGTT GGGGCTGCAA TTAATGCTAA CATAAGAGCT 4800
AAATGCTTGA TTAGAAATGA TCTCAAAACC TTTTAGAATT TCCAAAATCT TCATATTACT 4860

GAAACTGTCG GAATATATGG GTCCTGAAAT TCAGAAGATG ATAGTCACTC TTCCCATATT 4920
TATAGGCTAT TAAGGCAAGG GATATCTTAA ACATCATATT ACTTTATTTA GATTTCTACT 4980
ACTCCAATTA TTAATGTTAT GTATTTCTCA TTGTTTTACT TCTTCATGGT ATTATGAAGA 5040
CTATATAGAT GATTCAACCA AGCCTGCAAA TCTCCCTCTT GTGGAATTCC ACTGGACCCA 5100
ATCTGTTTTC CATTTCCATT GCAATACTAC TAAAGCCATA CAATATCAAG CACCCTCCCT 5160
CTAGGTCCAG GGACTATCAC AGAAGAAGCA GGCATGTAAG ATTTTAAGGA CTGGTTTCGA 5220
GGGGTCGAGT GTAGGAAAAC AGCCTGTTGC ATTGTAAGAG TGATGTCACC TTGAAGAGCA 5280
GCTGGCATGA TGACTGCTGT TTGACTCCTG CATACCAAGA TATTCTGCAG CAATGTCTTT 5340 
AAACAGTGCC GGTAGTACAG ATAACCCCTC ATAAAGATGC TTATCTAACC TCCCCAGTGT 5400 
TCAGGTGTTT CACAAGAAAG TCTGAGATAT GACTAGCTAC ACGTTTTGCC AAAAATGCTT 5460
GTTATATAAA GGGTACTTTT GGGAGGGTGA GTGCCGCCAT TTAGTGGCTG CTAGAAACAT 5520 
TGCTTCTGTT TGTAAGTTCC TATTAAATGT TCTTTCTGAG AAAAAAAAAA A 

SEQ ID NO:259 PBM4 Protein sequence:
Protein Accession #: BAB67786

MDTVMKQTHA DTPVDHCLSG IRKCSSTFKL KSEVNKHETA LEMQNPNLNN KECCFTFTLN 60
GNSRKLDRSV FTAYGKPSES IYSALSANDY FSERIKNQFN KNIIVYEEKT IDGHINLGMP 120

LKCLPSDSHF KITFGQRKSS KEDGHILRQC ENPNMECILF HVVAIGRTRK KIVKINELHE 180

KGSKLCIYAL KGETIEGALC KDGRFRSDIG EFEWKLKEGH KKIYGKQSMV DEVSGKVLEM 240
DISKKKALQQ KDIHKKIKQN ESATDEINHQ SLIQSKKKVH KPKKDGETKD VEHSREQILP 300 
PQDLSHYIKD KTRQTIPRIR NYYFCSLPRK YRQINSQVRR RPHLGRRYAI NLDVQKEAIN 360 
LLKNYQTLNE AIMHQYPNFK EEAQWVRKYF REEQKRMNLS PAKQFNIYKK DFGKMTANSV 420 
SVATCEQLTY YSKSVGFMQW DNNGNTGNAT CFVFNGGYIF TCRHVVHLMV GKNTHPSLWP 480
DIISKCAKVT FTYTEFCPTP DNWFSLEPWL KVSNENLDYA ILKLKENGNA FPPGLWRQIS 540 
PQPSTGLIYL IGHPEGQIKK IDGCTVIPLN ERLKKYPNDC QDGLVDLYDT TSNVYCMFTQ 600 
RSFLSEVWNT HTLSYDTCFS DGSSGSPVFN ASGKLVALHT FGLFYQRGFN VHALIEFGYS 660 
MDSILCDIKK TNESLYKSLN DEKLETYDEE KARPRPAYRR LGCFRFRSRF PILGTGETGR 720 
IEAGKDRRGH GVSETGSCSR RQGGALWVSP AQPIGFRSSW SSGAFASSNT SGNCVERWIP 780

GRVLARRAVS KEQQNNCSTS LMRMESRGDP RATTNTQAQR FHSPKKNPED QTMPQNRTIY 840 
VTLKAVRKEI ETHQGQEMLV RGTEGIKEYI NLGMPLSCFP EGGQVVITFS QSKSKQKEDN 900

HIFGRQDKAS TECVKFYIHA IGIGKCKRRI VKCGKLHKKG RKLCVYAFKG ETIKDALCKD 960 
GRFLSFLEND DWKLIENNDT ILESTQPVDE LEGRYFQVEV EKRMVPSAAA SQNPESEKRN 1020
TCVLREQIVA QYPSLKRESE KIIENFKKKM KVKNGETLFE LHRTTFGKVT KNSSSIKVVK 1080
LLVRLSDSVG YLFWDSATTG YATCFVFKGL FILTCRHVID SIVGDGIEPS KWATIIGQCV 1140
RVTFGYEELK DKETNYFFVE PWFEIHNEEL DYAVLKLKEN GQQVPMELYN GITPVPLSGL 1200

IHIIGHPYGE KKQIDACAVI PQGQRAKKCQ ERVQSKKAES PEYVHMYTQR SFQKIVHNPD 1260

VITYDTEFFF GASGSPVFDS KGSLVAMHAA GFAYTYQNET RSIIEFGSTM ESILLDIKQR 1320

HKPWYEEVFV NQQDVEMMSD EDL

SEQ ID NO:260 PBQ1 DNA sequence 
Nucleic Acid Accession #: NM_015642

Coding sequence: 489-2489 (underlined sequence corresponds to start and stop codon)

1          11         21         31         41         51
|          |          |          |          |          |

ACATTTCAAA AAAAATACAT AGACTGATGT TTCAGACTTG TGCAGCATAA GCCTACAGGG 60

TACGAAGAAT GAACTCTGAG AATGTTTGGA GAATGTTTCA TCATTACTAA CAGGATATTC 120

CTCATGACAT TGCTGTCTGA TCTTTGACCA TCAGTCTGTG ACCTGCCCCT TCTCTTTACA 180 

TGCAGCCGCT CTCTGCTCCC TGCCCCAATG AACATCTGCA CTAGGCCCAA GCCTTGGAGT 240 

AATTTACCTG AAGAGTGACA CCATTGATTT TGAAACTACT GAAGAAACCC AAGACAGCTG 300 

AAAACCAGAA GGCATCTGAG GAGAATGAGA TTACTCAGCC GGGTGGATCC AGCGCCAAGC 360

CGGGCCTTCC CTGCCTGAAC TTTGAAGCTG TTTTGTCTCC AGACCCAGCC CTCATCCACT 420 

CAACACATTC ACTGACAAAC TCTCACGCTC ACACCGGGTC ATCTGATTGT GACATCAGTT 480 

GCAAGGGGAT GACCGAGCGC ATTCACAGCA TCAACCTTCA CAACTTCAGC AATTCCGTGC 540

TCGAGACCCT CAACGAGCAG CGCAACCGTG GCCACTTCTG TGACGTAACG GTGCGCATCC 600 

ACGGGAGCAT GCTGCGCGCA CACCGCTGCG TGCTGGCAGC CGGCAGCCCC TTCTTCCAGG 660

ACAAACTGCT GCTTGGCTAC AGCGACATCG AGATCCCGTC GGTGGTGTCA GTGCAGTCAG 720 

TGCAAAAGCT CATTGACTTC ATGTACAGCG GCGTGCTACG GGTCTCGCAG TCGGAAGCTC 780 

TGCAGATCCT CACGGCCGCC AGCATCCTGC AGATCAAAAC AGTCATCGAC GAGTGCACGC 840 

GCATCGTGTC ACAGAACGTG GGCGATGTGT TCCCGGGGAT CCAGGACTCG GGCCAGGACA 900 

CGCCGCGGGG CACTCCCGAG TCAGGCACGT CAGGCCAGAG CAGCGACACG GAGTCGGGCT 960

ACCTGCAGAG CCACCCACAG CACAGCGTGG ACAGGATCTA CTCGGCACTC TACGCGTGCT 1020 

CCATGCAGAA TGGCAGCGGC GAGCGCTCTT TTTACAGCGG CGCAGTGGTC AGCCACCACG 1080 

AGACTGCGCT CGGCCTGCCC CGCGACCACC ACATGGAAGA CCCCAGCTGG ATCACACGCA 1140 

TCCATGAGCG CTCGCAGCAG ATGGAGCGCT ACCTGTCCAC CACCCCCGAG ACCACGCACT 1200 

GCCGCAAGCA GCCCCGGCCT GTGCGCATCC AGACCCTAGT GGGCAACATC CACATCAAGC 1260

AGGAGATGGA GGACGATTAC GACTACTACG GGCAGCAAAG GGTGCAGATC CTGGAACGCA 1320 

ACGAATCCGA GGAGTGCACG GAAGACACAG ACCAGGCCGA GGGCACCGAG AGTGAGCCCA 1380 

AAGGTGAAAG CTTCGACTCG GGCGTCAGCT CCTCCATAGG CACCGAGCCT GACTCGGTGG 1440 

AGCAGCAGTT TGGGCCTGGG GCGGCGCGGG ACAGCCAGGC TGAACCCACC CAACCCGAGC 1500 

AGGCTGCAGA AGCCCCCGCT GAGGGTGGTC CGCAGACAAA CCAGCTAGAA ACAGGTGCTT 1560

CCTCTCCGGA GAGAAGCAAT GAAGTGGAGA TGGACAGCAC TGTTATCACT GTCAGCAACA 1620 

GCTCCGACAA GAGCGTCCTA CAACAGCCTT CGGTCAACAC GTCCATCGGG CAGCCATTGC 1680 

CAAGTACCCA GCTCTACTTA CGCCAGACAG AAACCCTCAC CAGCAACCTG AGGATGCCTC 1740 

TGACCTTGAC CAGCAACACG CAGGTCATTG GCACAGCTGG CAACACCTAC CTGCCAGCCC 1800 

TCTTCACTAC CCAGCCCGCG GGCAGTGGCC CCAAGCCTTT CCTCTTCAGC CTGCCACAGC 1860

CCCTGGCAGG CCAGCAGACC CAGTTTGTGA CAGTGTCCCA GCCCGGTCTG TCGACCTTTA 1920 

CTGCACAGCT GCCAGCGCCA CAGCCCCTGG CCTCATCCGC AGGCCACAGC ACAGCCAGTG 1980 

GGCAAGGCGA AAAAAAGCCT TATGAGTGCA CTCTCTGCAA CAAGACTTTC ACCGCCAAAC 2040 

AGAACTACGT CAAGCACATG TTCGTACACA CAGGTGAGAA GCCCCACCAA TGCAGCATCT 2100 

GTTGGCGCTC CTTCTCCTTA AAGGATTACC TTATCAAGCA CATGGTGACA CACACAGGAG 2160
TGAGGGCATA CCAGTGTAGT ATCTGCAACA AGCGCTTCAC CCAGAAGAGC TCCCTCAACG 2220

TGCACATGCG CCTCCACCGG GGAGAGAAGT CCTACGAGTG CTACATCTGC AAAAAGAAGT 2280 

TCTCTCACAA GACCCTCCTG GAGCGACACG TGGCCCTGCA CAGTGCCAGC AATGGGACCC 2340 

CCCCTGCAGG CACACCCCCA GGTGCCCGCG CTGGCCCCCC AGGCGTGGTG GCCTGCACGG 2400 

AGGGGACCAC TTACGTCTGC TCCGTCTGCC CAGCAAAGTT TGACCAAATC GAGCAGTTCA 2460 

ACGACCACAT GAGGATGCAT GTGTCTGACG GATAAGTAGT ATCTTTCTCT CTTTCTTATG 2520 

AACAAAACAA AACAACAACA AAAAACAAAC AAACAAAAAA GCTATGGCAC TAGAATTTAA 2580 

GAAATGTTTT GGTTTCATTT TTACTTTCTG TTTTTGTTTT TGTTTCGTTT CATTTTGTAC 2640 

TACATGAAGA ACTGTTTTTT GCCTGCTGGT ACATTACATT TCCGGAGGCT TGGGTGAATA 2700 

ATAGTTTTCC CAGTCTCCCT CGGATGGTGG CCTTAAGGCC TGGTAGTGCT TCAAGAGGTC 2760 

CACTGGTTGG ATCTCTAGCT ACTGGCCTCT AAATACAACC CTTCTTTACA AAAAAAAAAA 2820 
AAAAAAAAA 



SEQ ID NO:261 PBQ1 Protein sequence:
Protein Accession #: NP_056457

MTERIHSINL HNFSNSVLET LNEQRNRGHF CDVTVRIHGS MLRAHRCVLA AGSPFFQDKL 60
LLGYSDIEIP SVVSVQSVQK LIDFMYSGVL RVSQSEALQI LTAASILQIK TVIDECTRIV 120
SQNVGDVFPG IQDSGQDTPR GTPESGTSGQ SSDTESGYLQ SHPQHSVDRI YSALYACSMQ 180 
NGSGERSFYS GAVVSHHETA LGLPRDHHME DPSWITRIHE RSQQMERYLS TTPETTHCRK 240 
QPRPVRIQTL VGNIHIKQEM EDDYDYYGQQ RVQILERNES EECTEDTDQA EGTESEPKGE 300 
SFDSGVSSSI GTEPDSVEQQ FGPGAARDSQ AEPTQPEQAA EAPAEGGPQT NQLETGASSP 360
ERSNEVEMDS TVITVSNSSD KSVLQQPSVN TSIGQPLPST QLYLRQTETL TSNLRMPLTL 420 
TSNTQVIGTA GNTYLPALFT TQPAGSGPKP FLFSLPQPLA GQQTQFVTVS QPGLSTFTAQ 480 
LPAPQPLASS AGHSTASGQG EKKPYECTLC NKTFTAKQNY VKHMFVHTGE KPHQCSICWR 540 
SFSLKDYLIK HMVTHTGVRA YQCSICNKRF TQKSSLNVHM RLHRGEKSYE CYICKKKFSH 600
KTLLERHVAL HSASNGTPPA GTPPGARAGP PGVVACTEGT TYVCSVCPAK FDQIEQFNDH 660 
MRMHVSDG 



SEQ ID NO:262 PBQ6 DNA sequence
Nucleic Acid Accession #: AI654187

Coding sequence: 1-912 (underlined sequence corresponds to start and stop codon)

1          11         21         31         41         51
|          |          |          |          |          |

ATGGTGGAAG AGGAAACAGG CATATCTTAC ATGGTGGCAG ACAAGGGACA CCCTTCTACA 60

AACTCTACCA CTTCTGCGCC GTCGTTTCGA CCATATAAAA ACGACCTATG CGAACTGCGT 120 

CGGAAAACTC CCTCACGATG TAAAACGAAG ATCAGGAGCA GATTTGAAGA ATTACAAAGT 180 

GAATTGGTGC CAGTCAGCAT GTCAGAGACA GACCACATAG CCTCTACTTC CTCTGATAAA 240 

AATGTTGGGA AAACACCTGA ATTAAAGGAA GACTCATGCA ACTTGTTTTC TGGCAATGAA 300 

AGCAGCAAAT TAGAAAATGA GTCCAAACTA TTGTCATTAA ACACTGATAA AACTTTATGT 360 

CAACCTAATG AGCATAATAA TCGAATTGAA GCCCAGGAAA ATTATATTCC AGATCATGGT 420 

GGAGGTGAGG ATTCTTGTGC CAAAACAGAC ACAGGCTCAG AAAATTCTGA ACAAATAGCT 480 

AATTTTCCTA GTGGAAATTT TGCTAAACAT ATTTCAAAAA CAAATGAAAC AGAACAGAAA 540 

GTAACACAAA TATTGGTGGA ATTAAGGTCA TCTACATTTC CAGAATCAGC TAATGAAAAG 600 

ACTTATTCAG AAAGCCCCTA TGATACAGAC TGCACCAAGA AATTTATTTC AAAAATAAAG 660 

AGCGTTTCAG CATCAGAGGA TTTGTTGGAA GAAATAGAAT CTGAGCTCTT ATCTACGGAG 720 

TTTGCAGAAC ATCGAGTACC AAATGGAATG AATAAGGGAG AACATGCATT AGTTCTGTTT 780 

GAAAAGTGTG TGCAAGATAA ATATTTGCAG CAGGAACATA TCATAAAAAA GGCCAGACTT 840 

GGTCTCTGTT ATTTGCCATC AAGAACCTCA ATTGACACGT TAATTCCGTT TATCCCAAAT 900 
TTATATAGAT AA 



SEQ ID NO:263 PBQ6 Protein sequence:
Protein Accession #: NP_060170

MEPKEATGKE NMVTKKKNLA FLRSRLYMLE RRKTDTVVES SVSGDHSGTL RRSQSDRTEY 60 
NQKLQEKMTP QGECSVAETL TPEEEHHMKR MMAKREKIIK ELIQTEKDYL NDLELCVREV 120
VQPLRNKKTD RLDVDSLFSN IESVHQISAK LLSLLEEATT DVEPAMQVJG EVFLQIKGPL 180
EDIYKIYCYH HDEAHSILES YEKEEELKEH LSHCIQSLK 



SEQ ID NO:264 PBY7 DNA sequence 
Nucleic Acid Accession #: NM_014323

Coding sequence: 662-2725 (underlined sequence corresponds to start and stop codon) 

1          11         21         31         41         51
|          |          |          |          |          |

GGGCCTACTC TGCCGCCGCC GCCGCCCGCC CGCTCCAGCC GCCGCCGCCG CCGCCACCGC 60 

CCTCCAGGCT CCGGGACCCG GCCCGCGCCA CCGCCCCCGT GCGCGCCCCG CCGCCGCCGC 120 

CTTCGCCTTC GCCTTTTGTT TCCTCCGCTC CGGCGCCCCC GCCCCGGCTC GCGCTTTGCA 180 

GGGGACGCAG CGCGCGCCCC CAGCGGGCCC GGGAAAAGCC GCGGCGCGCG CGCGCGCCTG 240 

CGCGGCGGAC CCCTCCTTCT CCTCCCCGCG TGCGCGTGCC CTTCTTGGCT GCGCGCCGGC 300

GCCGCCTGGG GGGCGGGAGG GGAGGTGGCA GGCGCGTTTG CAGGAGGGGC GCACCTCTTC 360 

GCTCGCGCAC CCCCCCGGAA GGTAGACCGG GAAGGGGAGG CGGGCGGGCG GAGAGGAGAG 420 

AGTGGCGCGC AGTCCAGCGA GGGCGGGGGT TGGCTATGTG GGGGGTGGTG CACCCCGCAG 480 

TCTAGACAGT CTGATCCGGG CTGGGGGCGT GTACACTCGG CGCACCTGCG AGACTACAGA 540 

GCCTCGGGCC GGCACGTGTG GGGAGTGTGG ACACGTCTGC TGCGCCCCGC TTCTCGCTGC 600
TGAGGGGAAG GGAGGGGGCG GGCAGGTGCA GCGGCCGGGC TAGTGGGAGG GGGCGGCGGC 660
CATGGAGCGG GTGAACGACG CTTCGTGCGG CCCGTCTGGC TGCTACACAT ACCAGGTGAG 720
CAGACACAGC ACGGAGATGC TCCACAACCT GAACCAGCAG CGCAAAAACG GCGGGCGCTT 780
CTGCGACGTG CTCTTGCGGG TAGGCGACGA GAGCTTCCCA GCGCACCGCG CCGTGCTGGC 840
CGCCTGCAGC GAGTACTTTG AGTCGGTGTT CAGCGCCCAG TTGGGCGACG GCGGAGCTGC 900
GGACGGGGGT CCGGCTGATG TAGGGGGCGC GACGGCAGCA CCAGGCGGCG GGGCCGGGGG 960
CAGCCGGGAG CTGGAGATGC ACACTATCAG CTCCAAGGTA TTTGGGGACA TTCTGGACTT 1020
CGCCTACACT TCCCGCATCG TGGTGCGCTT GGAGAGCTTT CCCGAACTCA TGACGGCCGC 1080
CAAGTTCCTG CTGATGAGGT CGGTTATCGA GATCTGCCAG GAAGTCATCA AACAGTCCAA 1140
CGTACAGATC CTGGTACCCC CTGCCCGCGC CGATATAATG CTCTTTCGCC CCCCTGGGAC 1200
CTCGGACTTG GGCTTCCCTT TGGACATGAC CAACGGGGCA GCCTTGGCAG CCAACAGCAA 1260
TGGCATCGCC GGCAGCATGC AGCCAGAGGA GGAGGCAGCT CGGGCGGCTG GTGCAGCCAT 1320
TGCAGGCCAA GCCTCTTTGC CTGTGTTACC TGGGGTGGAC CGCTTGCCCA TGGTGGCTGG 1380
ACCCCTATCC CCCCAACTGC TGACTTCCCC ATTCCCCAGT GTGGCATCCA GTGCCCCTCC 1440
CCTGACTGGC AAGCGAGGCC GGGGCCGCCC AAGGAAGGCC AACCTGCTGG ACTCAATGTT 1500
TGGGTCCCCA GGGGGCCTGA GGGAGGCAGG CATCCTTCCA TGCGGTCTAT GTGGTAAGGT 1560
GTTCACTGAT GCCAACCGGC TCCGGCAGCA CGAGGCCCAG CACGGTGTCA CCAGCCTCCA 1620
GCTGGGCTAC ATCGACCTTC CTCCTCCGAG GCTGGGTGAG AATGGGCTAC CCATCTCTGA 1680
AGACCCCGAC GGCCCCCGAA AGAGGAGCCG GACCAGGAAG CAGGTGGCTT GTGAGATCTG 1740
CGGCAAGATC TTCCGTGATG TGTATCATCT TAACCGGCAC AAGCTGTCCC ACTCTGGGGA 1800
GAAGCCCTAC TCCTGCCCTG TGTGTGGGTT GCGGTTCAAG AGAAAAGACC GCATGTCCTA 1860
CCATGTGCGG TCCCATGATG GGTCCGTGGG CAAGCCTTAC ATCTGCCAGA GCTGTGGGAA 1920
AGGCTTCTCC AGGCCTGATC ACTTGAACGG ACATATCAAG CAGGTGCACA CTTCTGAGCG 1980
GCCTCACAAG TGTCAGACCT GCAATGCTTC TTTTGCCACC CGAGACCGTC TGCGCTCCCA 2040
CCTGGCCTGT CATGAAGACA AGGTGCCCTG CCAGGTGTGT GGGAAGTACT TGCGGGCAGC 2100
ATACATGGCA GACCACCTGA AGAAGCACAG CGAGGGGCCC AGCAACTTCT GCAGTATCTG 2160
TAACCGAGGT TTCTCCTCTG CCTCCTACTT AAAGGTCCAT GTTAAAACCC ACCACGGTGT 2220
TCCCCTTCCC CAGGTCTCCA GGCACCAGGA GCCCATCCTG AATGGGGGAG CAGCGTTCCA 2280
CTGCGCCAGG ACCTATGGCA ACAAAGAAGG CCAGAAATGC TCACATCAGG ATCCGATTGA 2340
GAGCTCTGAC TCCTATGGTG ACCTCTCAGA TGCCAGCGAC CTGAAGACGC CAGAGAAGCA 2400
GAGTGCCAAT GGCTCTTTCT CCTGCGACAT GGCAGTCCCC AAAAACAAAA TGGAGTCTGA 2460
TGGGGAGAAG AAGTACCCAT GCCCTGAATG TGGGAGCTTC TTCCGCTCTA AGTCCTACTT 2520
GAACAAACAC ATCCAGAAGG TGCATGTCCG GGCTCTCGGG GGCCCCCTGG GGGACCTGGG 2580
CCCTGCCCTT GGCTCACCTT TCTCTCCTCA GCAGAACATG TCTCTCCTCG AGTCCTTTGG 2640
GTTTCAGATT GTTCAGTCGG CATTTGCGTC ATCTTTAGTA GATCCTGAGG TTGACCAGCA 2700
GCCCATGGGG CCTGAAGGGA AATGAGGCAG CTGCTGTGTC CCCACGGAAA CAACCATCTG 2760
GGGACTGCTG GGAAATGCTG TGAATGCGGA GGGAAGTGAT GTTTGGGTTC TGTAGCTGAG 2820
AGATTTTTAT TCATTTTTAA CTGCCCCCCA ACCCCACTCC AACTCCTTCT CCACCACCCA 2880
TTCTCCCAAT GGTCTTTAGA AATAGATTTT CATCTGATAT TCTGCAGAAA TATCAATGAG 2940
ACTTGGTATG GGACAGGGGC AGAAAACACT ACATAGGCCT CCAAGGCAAA ACCAGTCCCA 3000
GTTTCTTTAA TGGGAAGAAG CTGGAATTCC TGGTGCTCAA TTCTTAGTGA CCCCAATCCT 3060
ATACCCAAAT CTATGATATT CTGGGACCTC AGTGATTTTG GTCCCCTCCC ACTTCTCTAG 3120
TTCGTCATCC TCCCTTCCCA TATCCTTCAA AAGAACCACA CTAGGGTCTC CACCTACTTA 3180
TACAATGCGG ATGCCCAACT GTTTTTAAGG AAGCCAGAAG CATCCCATGG ACCATGGGGT 3240
GAGTGTCCTC CAAGAGCCCC CTGAGCTCAG CCCTCTGCCT GGAGGGCTCC AGACCTTTCT 3300
GAGCCCTGCT TGGAGGCGAG CATTTTCACT GCTAGGACAA GCTCAGCTGT TGAGGACACC 3360
CCCACCCCAA ATTTCAGTTC TTACGTGATT TTAACCATTC AACATGCTGT TGGGTTTTAA 3420
TTCTCTAATT ATTATTATTA TTGTTATTAT TTTTTAGGAC CAGTTGTAGT GAATTGCTAC 3480
TGAAAGCTAT CCCAGGTGAT ACAGAGCTCT TTGTAAACCG CAGTCACACA TTAGGGTTAG 3540
TATTAAACTT TGTTTAGATG TACCATAATT AACTTGGCTA GTTGATTGTT TGAAGTCTAT 3600
GGAAGAAATA GTTTTATGCA AAATTTTAAA AAATGCCAGT CTGGTCAGGG AAGTAGGGGG 3660
TTTCAATGCT GTTGGGAACC AGGAAGGTGG GACAGCCGGC AGGTAGGGAC ATTGTGTACC 3720
TCAGTTGTGT CACATGTGAG CAAGCCCAGG TTGACCTTGT GATGTGAATT GATCTGATCA 3780
GACTGTATTA AAAATGTTAG TACATTACTC TA



SEQ ID NO:265 PBY7 Protein sequence: 
Protein Accession #: NP_114439

MERVNDASCG PSGCYTYQVS RHSTEMLHNL NQQRKNGGRF CDVLLRVGDE SFPAHRAVLA 60
ACSEYFESVF SAQLGDGGAA DGGPADVGGA TAAPGGGAGG SRELEMHTIS SKVFGDILDF 120
AYTSRIVVRL ESFPELMTAA KFLLMRSVIE ICQEVIKQSN VQILVPPARA DIMLFRPPGT 180
SDLGFPLDMT NGAALAANSN GIAGSMQPEE EAARAAGAAI AGQASLPVLP GVDRLPMVAG 240
PLSPQLLTSP FPSVASSAPP LTGKRGRGRP RKANLLDSMF GSPGGLREAG ILPCGLCGKV 300 
FTDANRLRQH EAQHGVTSLQ LGYIDLPPPR LGENGLPISE DPDGPRKRSR TRKQVACEIC 360 
GKIFRDVYHL NRHKLSHSGE KPYSCPVCGL RFKRKDRMSY HVRSHDGSVG KPYICQSCGK 420
GFSRPDHLNG HIKQVHTSER PHKCQTCNAS FATRDRLRSH LACHEDKVPC QVCGKYLRAA 480 
YMADHLKKHS EGPSNFCSIC NREGQKCSHQ DPIESSDSYG DLSDASDLKT PEKQSANGSF 540 
SCDMAVPKNK MESDGEKKYP CPECGSFFRS KSYLNKHIQK VHVRALGGPL GDLGPALGSP 600 
FSPQQNMSLL ESFGFQIVQS AFASSLVDPE VDQQPMGPEG K 



SEQ ID NO:266 PBY9 DNA sequence
Nucleic Acid Accession #: NM_012429
Coding sequence: 174-1385 (underlined sequence corresponds to start and stop codon)

1          11         21         31         41         51
|          |          |          |          |          |

CCCTACTCCG CCTCTCGGGA TCCTTTAAGA GGCGGGGCTT GGCTGCCAGC TCCGCGGCCC 60
GGGCAAAAGG CTGGGACTTT ACTCCGGGTG GCGGCGAGGA CGAGTCTGTG CTCCATCAGC 120
TGCCGCACCC GCCGCCTCCC GCCCCCAAAC CCCATCCCCG CGGTTGAGCC ACGATGAGCG 180
GCAGAGTCGG CGATCTGAGC CCCAGGCAGA AGGAGGCATT GGCCAAGTTT CGGGAGAATG 240
TCCAGGATGT GCTGCCGGCC CTGCCGAATC CAGATGACTA TTTTCTCCTG CGTTGGCTCC 300
GAGCCAGAAG CTTCGACCTG CAGAAGTCGG AGGCCATGCT CCGGAAGCAT GTGGAGTTCC 360
GAAAGCAAAA GGACATTGAC AACATCATTA GCTGGCAGCC TCCAGAGGTG ATCCAACAGT 420
ATCTGTCAGG GGGTATGTGT GGCTATGACC TGGATGGCTG CCCAGTCTGG TACGACATAA 480
TTGGACCTCT GGATGCCAAG GGTCTGCTGT TCTCAGCCTC CAAACAGGAC CTGCTGAGGA 540
CCAAGATGCG GGAGTGTGAG CTGCTTCTGC AAGAGTGTGC CCACCAGACC ACAAAGTTGG 600
GGAGGAAGGT GGAGACCATC ACCATAATTT ATGACTGCGA GGGGCTTGGC CTCAAGCATC 660
TCTGGAAGCC TGCTGTGGAG GCCTATGGAG AGTTTCTCTG CATGTTTGAG GAAAATTATC 720
CCGAAACACT GAAGCGTCTT TTTGTTGTTA AAGCCCCCAA ACTGTTTCCT GTGGCCTATA 780
ACCTCATCAA ACCCTTCCTG AGTGAGGACA CTCGTAAGAA GATCATGGTC CTGGGAGCAA 840
ATTGGAAGGA GGTTTTACTG AAACATATCA GCCCTGACCA GGTGCCTGTG GAGTATGGGG 900
GCACCATGAC TGACCCTGAT GGAAACCCCA AGTGCAAATC CAAGATCAAC TACGGGGGTG 960
ACATCCCCAG GAAGTATTAT GTGCGAGACC AGGTGAAACA GCAGTATGAA CACAGCGTGC 1020
AGATTTCCCG TGGCTCCTCC CACCAAGTGG AGTATGAGAT CCTCTTCCCT GGCTGTGTCC 1080
TCAGGTGGCA GTTTATGTCA GATGGAGCGG ATGTTGGTTT TGGGATTTTC CTGAAGACCA 1140
AGATGGGAGA GAGGCAGCGG GCAGGGGAGA TGACAGAGGT GCTGCCCAAC CAGAGGTACA 1200
ACTCCCACCT GGTCCCTGAA GATGGGACCC TCACCTGCAG TGATCCTGGC ATCTATGTCC 1260
TGCGGTTTGA CAACACCTAC AGCTTCATTC ATGCCAAGAA GGTCAATTTC ACTGTGGAGG 1320
TCCTGCTTCC AGACAAAGCC TCAGAAGAGA AGATGAAACA GCTGGGGGCA GGCACCCCGA 1380
AATAACACCT TCTCCTATAG CAGGCCTGGC CCCCTCAGTG TCTCCCTGTC AATTTCTACC 1440
CCTTGTAGCA GTCATTTTCG CACAACCCTG AAGCCCAAAG AAACTGGGCT GGAGGACAGA 1500
CCTCAGGAGC TTTCATTTCA GTTAGGCAGA GGAAGAGCGA CTGCAGTGGG TCTCCGTGTC 1560
TATCAAATAC CTAAGGAGTC CCCAGGAGCT GGCTGGCCAT CGTGATAGGA TCTGTCTGTC 1620
CTGTAAACTG TGCCAACTTC ACCTGTCCAG GGACAGCGAA GCTGGGGGTG GCGGGGGGCA 1680
TGTACCACAG GGTGGCAGCA GGGAAAAAAA TTAGAAAAGG GTGAAAGATT GGGACTTAAC 1740
ACTTCAGGGA AGTCAGCTGC CGGGGAGAAA CTTGCTCCTA AATGAACACA TAAGTTTAGA 1800
TCGCAATGAG GAGTAGCAGG GTAGCTGGTT GCTAGAGTTA CGGTGGGGAT CAGAAACTCT 1860
TCCAAACATT TTAGCACTGA GGCTGGGGTA GCTTTTGGCT TTTCCCAGGT CTCAGGAGGT 1920
GGCCTGAGTC AGCACACATC TTCCCACTCG GTAGACAGGC TGGCCTCTCC CTCACTTTGA 1980
GACTTTGGCA ACTCCTGGGC CACACGGCCT GCCTCTTTGA TTACTAATGA TTGTCAGTGA 2040
CTCAGAGCTT CCTGGGACTT CGGGTACCCA CCCGCTGTTC TCCATGCAAA CAAAGCGCCA 2100
GGGAAATGAC CCACAGGGAT CGCAGCTGCA GGGAGGGCCA GGGAGGTTGG GGGTGGGAGT 2160
GAATGCTAAA AGCAGATCGT CCAGTGCCCT TTTCAGTGCT ACCGGCCTCT CACCAAGCAG 2220
TCCTCCATGT GAGCAACCCC GAGACAAAAA TGCTAAGTGG GATCAAGAGA GCAGCACTCG 2280
GAGAGGGTGT TTGCCAGTCT GAGTGTCCCG CGGTGCCCGC CAACCCGCTT CCTGACTGAC 2340
CTGAGCAAGG TCTTACTAAG CAGTCCCATC TCTGTGGGAG GCATGCAACG CGTGCAGGGA 2400
GTTCAGGTGC CGGTCGGCGT AGCCAGGCCT GGAGGCCCCC CAGGCAGGAG GCCGCCCAAA 2460
GGCGGGGCCG GCGTCTCGCA GACTAGGGGC TGGGGGCGGC CACAGACGGC CTCGAAACCA 2520
CAGCCCTTAC CCCAATCCCA CGAGCCCCGC CAACGAACCA CAGGTGCTGG GCTTTAGAGA 2580
ACATGGGAAG GCGGCCCCAG ACCTGGCGGG AACGCCTTTC CCTCAGAGCC AGGCCCCGGC 2640
CCCGTCTGGG AAGCTCATCT TGCGAAGCTG AGGGAGCTCA GGGCAAAGGC CAGGCTAGCG 2700
CGGACCGGAA GGGGCCGAGG CTGCACGGGC CTCTGCCAGA ACGCTCAGGA CATCCCGGCC 2760
TGGGTTTACA ACGCTGTTAG GAAAATTAAC CAATGAATAA AGCAACGTTC AGTGCGCA






SEQ ID NO:267 PBY9 Protein sequence:
Protein Accession #: NP_036561



MSGRVGDLSP RQKEALAKFR ENVQDVLPAL PNPDDYFLLR WLRARSFDLQ KSEAMLRKHV 60 
EFRKQKDIDN IISWQPPEVI QQYLSGGMCG YDLDGCPVWY DIIGPLDAKG LLFSASKQDL 120
LRTKMRECEL LLQECAHQTT KLGRKVETIT IIYDCEGLGL KHLWKPAVEA YGEFLCMFEE 180 
NYPETLKRLF VVKAPKLFPV AYNLIKPFLS EDTRKKIMVL GANWKEVLLK HISPDQVPVE 240 
YGGTMTDPDG NPKCKSKINY GGDIPRKYYV RDQVKQQYEH SVQISRGSSH QVEYEILFPG 300
CVLRWQFMSD GADVGFGIFL KTKMGERQRA GEMTEVLPNQ RYNSHLVPED GTLTCSDPGI 360 
YVLRFDNTYS FIHAKKVNFT VEVLLPDKAS EEKMKQLGAG TPK 



SEQ ID NO:268 PBH8 DNA sequence
Nucleic Acid Accession #: XM_009756

Coding sequence: 301-1440 (underlined sequence corresponds to start and stop codon) 

1          11         21         31         41         51
|          |          |          |          |          |

GTGGGGACAG CCGAGCCGCG CCGGGCCCCT GGACGGCGTC GCCAAGGAGC TGGGATCGCA 60 

CTTGCTGCAG ACTTTGGATG GATTTGTTTT TGTGGTAGCA TCTGATGGCA AAATCATGTA 120

TATATCCGAG ACCGCTTCTG TCCATTTAGG CTTATCCCAG GTGGAGCTCA CGGGCAACAG 180

TATTTATGAA TACATCCATC CTTCTGACCA CGATGAGATG ACCGCTGTCC TCACGGCCCA 240 

CCAGCCGCTG CACCACCACC TGCTCCAAGG TATGAGATAG AGAGGTCGTT CTTTCTTCGA 300

ATGAAATGTG TCTTGGCGAA AAGGAACGCG GGCCTGACCT GCAGCGGATA CAAGGTCATC 360 

CACTGCAGTG GCTACTTGAA GATCAGGCAG TATATGCTGG ACATGTCCCT GTACGACTCC 420 

TGCTACCAGA TTGTGGGGCT GGTGGCCGTG GGCCAGTCGC TGCCACCCAG TGCCATCACC 480 

GAGATCAAGC TGTACAGTAA CATGTTCATG TTCAGGGCCA GCCTTGACCT GAAGCTGATA 540 

TTCCTGGATT CCAGGGTGAC CGAGGTGACG GGGTACGAGC CGCAGGACCT GATCGAGAAG 600

ACCCTATACC ATCACGTGCA CGGCTGCGAC GTGTTCCACC TCCGCTACGC ACACCACCTC 660 

CTGTTGGTGA AGGGCCAGGT CACCACCAAG TACTACCGGC TGCTGTCCAA GCGGGGCGGC 720 

TGGGTGTGGG TGCAGAGCTA CGCCACCGTG GTGCACAACA GCCGCTCGTC CCGGCCCCAC 780 

TGCATCGTGA GTGTCAATTA TGTACTCACG GAGATTGAAT ACAAGGAACT TCAGCTGTCC 840 

CTGGAGCAGG TGTCCACTGC CAAGTCCCAG GACTCCTGGA GGACCGCCTT GTCTACCTCA 900

CAAGAAACTA GGAAATTAGT GAAACCCAAA AATACCAAGA TGAAGACAAA GCTGAGAACA 960

AACCCTTACC CCCCACAGCA ATACAGCTCG TTCCAAATGG ACAAACTGGA ATGCGGCCAG 1020 

CTCGGAAACT GGAGAGCCAG TCCCCCTGCA AGCGCTGCTG CTCCTCCAGA ACTGCAGCCC 1080 

CACTCAGAAA GCAGTGACCT TCTGTACACG CCATCCTACA GCCTGCCCTT CTCCTACCAT 1140 

TACGGACACT TCCCTCTGGA CTCTCACGTC TTCAGCAGCA AAAAGCCAAT GTTGCCGGCC 1200

AAGTTCGGGC AGCCCCAAGG ATCCCCTTGT GAGGTGGCAC GCTTTTTCCT GAGCACACTG 1260 

CCAGCCAGCG GTGAATGCCA GTGGCATTAT GCCAACCCCC TAGTGCCTAG CAGCTCGTCT 1320 

CCAGCTAAAA ATCCTCCAGA GCCACCGGCG AACACTGCTA GGCACAGCCT GGTGCCAAGC 1380 

TACGAAGGCA AGCAGATGTC CTCTGCGGAG ATACCGCCAG CTCCCCAGGA CGCAGACTGA 1440 
CTCCTGTTTG CTCGCTGGAC CAAC






SEQ ID NO:269 PBH8 Protein sequence: 
Protein Accession #: NP_005060 



MKEKSKNAAK TRREKENGEF YELAKLLPLP SAITSQLDKA SIIRLTTSYL KMRAVFPEGL 60
GDAWGQPSRA GPLDGVAKEL GSHLLQTLDG FVFVVASDGK IMYISETASV HLGLSQVELT 120
GNSIYEYIHP SDHDEMTAVL TAHQPLHHHL LQEYEIERSF FLRMKCVLAK RNAGLTCSGY 180
KVIHCSGYLK IRQYMLDMSL YDSCYQIVGL VAVGQSLPPS AITEIKLYSN MFMFRASLDL 240
KLIFLDSRVT EVTGYEPQDL IEKTLYHHVH GCDVFHLRYA HHLLLVKGQV TTKYYRLLSK 300
RGGWVWVQSY ATVVHNSRSS RPHCIVSVNY VLTEIEYKEL QLSLEQVSTA KSQDSWRTAL 360
STSQETRKLV KPKNTKMKTK LRTNPYPPQQ YSSFQMDKLE CGQLGNWRAS PPASAAAPPE 420
LQPHSESSDL LYTPSYSLPF SYHYGHFPLD SHVFSSKKPM LPAKFGQPQG SPCEVARFFL 480
STLPASGECQ WHYANPLVPS SSSPAKNPPE PPANTARHSL VPSYEAPAAA VRRFGEDTAP 540
PSFPSCGHYR EEPALGPAKA ARQAARDGAR LALARAAPEC CAPPTPEAPG APAQLPFVLL 600
NYHRVLARRG PLGGAAPAAS GLACAPGGPE AATGALRLRH PSPAATSPPG APLPHYLGAS 660
VIITNGR



SEQ ID NO:270 PB.19 DNA sequence:

Nucleic Acid Accession #: AA760894

GGCACGAGGA GAAGATGTGG CTTGCTCATG CTTGACTTCT GCCATGGTTG TGAGGCCTCC 60
CCAGCCATGT GGAACTGTTT TCAGGTGCTG GTTCCATGGC TCTTCCTGAG CCGAAAATAA 120
GGAAACTCCA TAGACCTTGT CCACTGGAAC TCGTTCCCAT CTACCCTCCA CTCTATCCAG 180
GGTGATGGAT CTCTGCAGTA AGTGGAAGAG TTCTTCATGG CCCCCAAGGT TATATCCATC 240
TAGAACTTCA GCACGTAATT TCATCTGGAA ATAGTGCCTT TGTGGATATA AGTTAGGTAA 300
AACTGAAGAT GAGATCATAC TGGATTAGGA TGGGATCTAA ATCCAATGAA AATGTCTTCA 360
TAAAAAACAG GAAAGAACCC ATAGAAACAC AAGGAAGAAG GTCATGTGAA GATGGACMCA 420
GAGATTGGAG GGATGCAGCC ACCGGCCCAG GAATGCCAGC AGCCACCCAG AAGCTGGAAG 480
GAAATGAGGG ATTCTCTCCT AGAACCTTTA GAGAGRACAT GGTCCTGTGA ACAGCTTGAT 540
TTTGGACTTG CCCATAGCTT GTATACTCTT ACTTTGGATA CAATTTTATC CAAACTTGGC 600
TAAACAGTTT CTCAGCCTAT GGAAAATTTA AAATGGAGAA GATTCAACTC GATTCTTACA 660
GATTCAAAGC AAGAAAATGA TGGGAACATA GGAGGAGACC AAGAAAGCCT ATAAAAAGCA 720
AAAATATGAA GTGAACATTG TGGTAGCTTT AAGATGTTTA GTGTAGCTGC AGGCACCCTA 780
TACACATGAA AACCCCCAAG GGGAATCCCC ATATCACAGT GTAGTGTGAT ATTTGACATT 840
YGTGATCATY TAGAGATGTA CAGAAAAGGT GAATCTGTGT TCTGTATATT CTGCCTAAGG 900
CAAAGAAATG TTTAGCTYTC TTTAAAATAG TTCCATAATT TTTTYTAAAA AGCTTTGCTT 960
GAAAACTGTA AGCTTCCCAT ATCTGGAGCA TTTCACTTTA AATATTTGGA TAAATATGTT 1020
ATCTTCTTAC TTGGACATTT CATGTGTTTA GGGATTGTYT TYTAAATTCT TCCTAATTCA 1080
TATAGCTGCT AACACTTCCC GCAGAGCTAA ACCATTACAG ANTATGAAAT AAAGACCCTA 1140
TTGATTTGAA CTTAAAAAAA AAAAMAMAAA AAAAAAAAAA AAAAAAAAAT GA

SEQ ID NO:271 PBQ4 DNA sequence 
Nucleic Acid Accession #: AA149579

Coding sequence: 1-1363 (underlined sequence corresponds to start and stop codon)

1 11 21 31 41 51 

ATGGAATCAA TCTCTATGAT GGGAAGCCCT AAGAGCCTTA GTGAAACTTG TTTACCTAAT 60

GGCATAAATG GTATCAAAGA TGCAAGGAAG GTCACTGTAG GTGTGATTGG AAGTGGAGAT 120 

TTTGCCAAAT CCTTGACCAT TCGACTTATT AGATGCGGCT ATCATGTGGT CATAGGAAGT 180 

AGAAATCCTA AGTTTGCTTC TGAATTTTTT CCTCATGTGG TAGATGTCAC TCATCATGAA 240 

GATGCTCTCA CAAAAACAAA TATAATATTT GTTGCTATAC ACAGAGAACA TTATACCTCC 300 

CTGTGGGACC TGAGACATCT GCTTGTGGGT AAAATCCTGA TTGATGTGAG CAATAACATG 360

AGGATAAACC AGTACCCAGA ATCCAATGCT GAATATTTGG CTTCATTATT CCCAGATTCT 420 

TTGATTGTCA AAGGATTTAA TGTTGTCTCA GCTTGGGCAC TTCAGTTAGG ACCTAAGGAT 480 

GCCAGCCGGC AGGTTTATAT ATGCAGCAAC AATATTCAAG CGCGACAACA GGTTATTGAA 540 

CTTGCCCGCC AGTTGAATTT CATTCCCATT GACTTGGGAT CCTTATCATC AGCCAGAGAG 600 

ATTGAAAATT TACCCCTACG ACTCTTTACT CTCTGGAGAG GGCCAGTGGT GGTAGCTATA 660

AGCTTGGCCA CATTTTTTTT CCTTTATTCC TTTGTCAGAG ATGTGATTCA TCCATATGCT 720 

AGAAACCAAC AGAGTGACTT TTACAAAATT CCTATAGAGA TTGTGAATAA AACCTTACCT 780 

ATAGTTGCCA TTACTTTGCT CTCCCTAGTA TACCTCGCAG GTCTTCTGGC AGCTGCTTAT 840 

CAACTTTATT ACGGCACCAA GTATAGGAGA TTTCCACCTT GGTTGGAAAC CTGGTTACAG 900 

TGTAGAAAAC AGCTTGGATT ACTAAGTTTT TTCTTCGCTA TGGTCCATGT TGCCTACAGC 960

CTCTGCTTAC CGATGAGAAG GTCAGAGAGA TATTTGTTTC TCAACATGGC TTATCAGCAG 1020 

GTTCATGCAA ATATTGAAAA CTCTTGGAAT GAGGAAGAAG TTTGGAGAAT TGAAATGTAT 1080 

ATCTCCTTTG GCATAATGAG CCTTGGCTTA CTTTCCCTCC TGGCAGTCAC TTCTATCCCT 1140 

TCAGTGAGCA ATGCTTTAAA CTGGAGAGAA TTCAGTTTTA TTCAGTCTAC ACTTGGATAT 1200 




GTCGCTCTGC TCATAAGTAC TTTCCATGTT TTAATTTATG GATGGAAACG AGCTTTTGAG 1260 
GAAGAGTACT ACAGATTTTA TACACCACCA AACTTTGTTC TTGCTCTTGT TTTGCCCTCA 1320 
ATTGTAATTC TGGATCTTTT GCAGCTTTGC AGATACCCAG ACTGA 

SEQ ID NO:272 PBQ4 Protein sequence: 
Protein Accession #: none 



1 11 21 31 41 51

| | | | | |

MESISMMGSP KSLSETCLPN GINGIKDARK VTVGVIGSGD FAKSLTIRLI RCGYHVVIGS 60

RNPKFASEFF PHVVDVTHHE DALTKTNIIF VAIHREHYTS LWDLRHLLVG KILIDVSNNM 120

RINQYPESNA EYLASLFPDS LIVKGFNVVS AWALQLGPKD ASRQVYICSN NIQARQQVIE 180

LARQLNFIPI DLGSLSSARE IENLPLRLFT LWRGPVVVAI SLATFFFLYS FVRDVIHPYA 240

RNQQSDFYKI PIEIVNKTLP IVAITLLSLV YLAGLLAAAY QLYYGTKYRR FPPWLETWLQ 300

CRKQLGLLSF FFAMVHVAYS LCLPMRRSER YLFLNMAYQQ VHANIENSWN EEEVWRIEMY 360

ISFGIMSLGL LSLLAVTSIP SVSNALNWRE FSFIQSTLGY VALLISTFHV LIYGWKRAFE 420
EEYYRFYTPP NFVLALVLPS IVILDLLQLC RYPD

SEQ ID NO:273 PBQ5 DNA SEQUENCE 

Nucleic Acid Accession #: NM_001973

Coding sequence: 150-1445 (underlined sequence corresponds to start and stop codon)

1 11 21 31 41 51

| | | | | |

CCGCCGCCTT CTACTCCGCC GCGGGGGTCG CAGCGGCTGC CGCGCCGTCC TCGAGTTTCC 60 

AGCGTGAGGA GGAGGCTGAG GGCGGAGAGG CGCATCGTGT TCGAGGCGGA GACCGAGGGG 120 

GAGCCCCGCG CGCGGCGTCG CTCATTGCTA TGGACAGTGC TATCACCCTG TGGCAGTTCC 180

TTCTTCAGCT CCTGCAGAAG CCTCAGAACA AGCACATGAT CTGTTGGACC TCTAATGATG 240 

GGCAGTTTAA GCTTTTGCAG GCAGAAGAGG TGGCTCGTCT CTGGGGGATT CGCAAGAACA 300 

AGCCTAACAT GAATTATGAC AAACTCAGCC GAGCCCTCAG ATACTATTAT GTAAAGAATA 360 

TCATCAAAAA AGTGAATGGT CAGAAGTTTG TGTACAAGTT TGTCTCTTAT CCAGAGATTT 420 

TGAACATGGA TCCAATGACA GTGGGCAGGA TTGAGGGTGA CTGTGAAAGT TTAAACTTCA 480 

GTGAAGTCAG CAGCAGTTCC AAAGATGTGG AGAATGGAGG GAAAGATAAA CCACCTCAGC 540 

CTGGTGCCAA GACCTCTAGC CGCAATGACT ACATACACTC TGGCTTATAT TCTTCATTTA 600 

CTCTCAACTC TTTGAACTCC TCCAATGTAA AGCTTTTCAA ATTGATAAAG ACTGAGAATC 660 

CAGCCGAGAA ACTGGCAGAG AAAAAATCTC CTCAGGAGCC CACACCATCT GTCATCAAAT 720 

TTGTCACGAC ACCTTCCAAA AAGCCACCAG TTGAACCTGT TGCTGCCACC ATTTCAATTG 780 

GCCCAAGTAT TTCTCCATCT TCAGAAGAAA CTATCCAAGC TTTGGAGACA TTGGTTTCCC 840 

CAAAACTGCC TTCCCTGGAA GCCCCAACCT CTGCCTCTAA CGTAATGACT GCTTTTGCCA 900

CCACACCACC CATTTCGTCC ATACCCCCTT TGCAGGAACC TCCCAGAACA CCTTCACCAC 960 

CACTGAGTTC TCACCCAGAC ATCGACACAG ACATTGATTC AGTGGCTTCT CAGCCAATGG 1020 

AACTTCCAGA GAATTTGTCT CTGGAGCCTA AAGACCAGGA TTCAGTCTTG CTAGAAAAGG 1080 

ACAAAGTAAA TAATTCATCA AGATCCAAGA AACCCAAAGG GTTAGGACTG GCACCCACCC 1140 

TTGTGATCAC GAGCAGTGAT CCAAGCCCAC TGGGAATACT GAGCCCATCT CTCCCTACAG 1200 

CTTCTCTTAC ACCAGCATTT TTTTCACAGA CACCCATCAT ACTGACTCCA AGCCCCTTGC 1260 

TCTCCAGTAT CCACTTCTGG AGTACTCTCA GTCCTGTTGC TCCCCTAAGT CCAGCCAGAC 1320 

TGCAAGGTGC TAACACACTT TTCCAGTTTC CTTCTGTACT GAACAGTCAT GGGCCATTCA 1380 

CTCTGTCTGG GCTGGATGGA CCTTCCACCC CTGGCCCATT TTCCCCAGAC CTACAGAAGA 1440 

CATAACCTAT GCACTTGTGG AATGAGAGAA CCGAGGAACG AAGAAACAGA CATTCAACAT 1500 

GATTGCATTT GAAGTGAGCA ATTGATAGTT CTACAATGCT GATAATAGAC TATTGTGATT 1560 

TTTGCCATTC CCCATTGAAA ACATCTTTTT AGGATTCTCT TTGAATAGGA CTCAAGTTGG 1620 

ACTATATGTA TAAAAATGCC TTAATTGGAG TCTAAACTCC ACCTCCCTCT GTCTTTTCCT 1680 

TTTCTTTTTC TTTCCTTCCT TCCTTTTCTT TTCTCCTTTA AAAATATTTT GAGCTTTGTG 1740 

CTGAAGAAGT TTTTGGTGGG CTTTAGTGAC TGTGCTTTGC AAAAGCAATT AAGAACAAAG 1800 

TTACTCCTTC TGGCTATTGG GACCCTTTGG CCAGGAAAAA TTATGCTTAG AATCTATTAT 1860 

TTAAAGAAGT ATTTGTGAAA TGAAAAAAAA AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA 1920 
AAAAAAAAAA AAA 



SEQ ID NO:274 PBQ5 Protein sequence: 
Protein Accession #: NP_001964

MDSAITLWQF LLQLLQKPQN KHMICWTSND GQFKLLQAEE VARLWGIRKN KPNMNYDKLS 60
RALRYYYVKN IIKKVNGQKF VYKFVSYPEI LNMDPMTVGR IEGDCESLNF SEVSSSSKDV 120
ENGGKDKPPQ PGAKTSSRND YIHSGLYSSF TLNSLNSSNV KLFKLIKTEN PAEKLAEKKS 180
PQEPTPSVIK FVTTPSKKPP VEPVAATISI GPSISPSSEE TIQALETLVS PKLPSLEAPT 240
SASNVMTAFA TTPPISSIPP LQEPPRTPSP PLSSHPDIDT DIDSVASQPM ELPENLSLEP 300
KDQDSVLLEK DKVNNSSRSK KPKGLGLAPT LVITSSDPSP LGILSPSLPT ASLTPAFFSQ 360
TPIILTPSPL LSSIHFWSTL SPVAPLSPAR LQGANTLFQF PSVLNSHGPF TLSGLDGPST 420
PGPFSPDLQK T

SEQ ID NO:275 PBY3 DNA SEQUENCE

Nucleic Acid Accession #: AB040921

Coding sequence: 131-2560 (underlined sequence corresponds to start and stop codon) 

1 11 21 31 41 51 



AATCAGGAAC AGATCATATA TTGACCGAGA TTCTGAGTAT CTCTTGCAAG AAAATGAACC 60

AGATGGAACT TTAGACCAAA AATTATTGGA AGATTTACAA AAGAAAAAAA ATGACCTTCG 120

GTATATTGAA ATGCAGCATT TCAGAGAAAA GCTGCCTTCG TATGGAATGC AAAAGGAATT 180

GGTAAATTTA ATTGATAACC ATCAGGTAAC AGTAATAAGT GGTGAAACTG GTTGTGGCAA 240

AACCACTCAA GTTACTCAGT TCATTTTGGA TAACTACATT GAAAGAGGAA AAGGATCTGC 300

TTGCAGAATA GTTTGTACTC AGCCAAGAAG AATTAGTGCC ATTTCAGTTG CGGAAAGAGT 360

AGCTGCAGAA AGGGCAGAAT CTTGTGGCAG TGGTAATAGT ACTGGATATC AAATTCGTCT 420

CCAGAGTCGG TTGCCAAGGA AACAGGGTTC TATCTTATAC TGTACAACAG GAATCATCCT 480

TCAGTGGCTC CAGTCAGACC CGTATTTGTC CAGTGTTAGT CATATCGTAC TTGATGAAAT 540

CCATGAAAGA AATCTGCAGT CAGATGTTTT AATGACTGTT GTTAAAGACC TTCTCAATTT 600

TCGATCTGAC TTGAAAGTAA TATTGATGAG TGCAACATTG AATGCAGAAA AGTTTTCAGA 660

ATATTTTGGT AACTGTCCAA TGATACATAT ACCTGGTTTT ACCTTTCCGG TTGTGGAATA 720

TCTTTTGGAA GATGTAATTG AAAAAATAAG GTATGTTCCA GAACAAAAAG AACACAGATC 780

CCAGTTTAAG AGGGGTTTCA TGCAAGGGCA TGTAAATAGA CAAGAAAAAG AAGAAAAAGA 840

AGCAATATAT AAAGAACGTT GGCCAGATTA TGTAAGGGAA CTGCGAAGAA GGTATTCTGC 900

AAGTACTGTA GATGTTATAG AAATGATGGA GGATGATAAA GTTGATCTGA ATTTGATTGT 960

TGCCCTCATC CGATACATTG TTTTGGAAGA AGAGGATGGT GCGATACTGG TCTTTCTGCC 1020

AGGCTGGGAC AATATCAGCA CTTTACATGA TCTCTTGATG TCACAAGTAA TGTTTAAATC 1080

AGATAAATTT TTAATTATAC CTTTACATTC ACTGATGCCT ACAGTTAACC AGACACAGGT 1140

GTTTAAAAGA ACCCCTCCTG GTGTTCGGAA AATAGTAATT GCTACCAACA TTGCGGAGAC 1200

TAGCATTACC ATAGATGATG TCGTTTATGT GATAGATGGA GGAAAAATAA AAGAGACGCA 1260

TTTTGATACT CAGAACAATA TCAGTACAAT GTCCGCTGAG TGGGTTAGTA AAGCTAATGC 1320

CAAACAGAGA AAAGGTCGAG CTGGAAGAGT TCAACCTGGT CATTGCTATC ATCTGTATAA 1380

TGGTCTTAGA GCAAGTCTTC TAGATGACTA TCAACTGCCA GAAATTTTGA GAACTCCTTT 1440

GGAAGAACTT TGTTTACAAA TAAAGATTTT AAGGCTAGGT GGAATTGCTT ATTTTCTGAG 1500

TAGATTAATG GACCCACCAT CAAATGAGGC AGTGTTACTC TCCATAAGAC ACCTGATGGA 1560

GCTGAACGCT TTGGATAAAC AAGAAGAATT GACACCTCTT GGAGTCCACT TGGCACGATT 1620

ACCCGTTGAG CCACATATTG GAAAAATGAT TCTTTTTGGA GCACTGTTCT GCTGCTTAGA 1680

CCCAGTACTC ACTATTGCTG CTAGTCTCAG TTTCAAAGAT CCATTTGTCA TTCCACTGGG 1740

AAAAGAAAAG ATTGCAGATG CAAGAAGAAA GGAATTGGCA AAGGATACTA GAAGTGATCA 1800

CTTAACAGTT GTGAATGCGT TTGAGGGCTG GGAAGAGGCT AGGCGACGTG GTTTCAGATA 1860

CGAAAAGGAC TATTGCTGGG AATATTTTCT GTCTTCAAAC ACACTGCAGA TGCTGCATAA 1920

CATGAAAGGA CAGTTTGCTG AGCATCTTCT TGGAGCTGGA TTTGTAAGCA GTAGAAATCC 1980

TAAAGATCCA GAATCTAATA TAAATTCAGA TAATGAGAAG ATAATTAAAG CTGTCATCTG 2040

TGCTGGTTTA TATCCCAAAG TTGCTAAAAT TCGACTAAAT TTGGGTAAAA AAAGAAAAAT 2100

GGTAAAAGTT TACACAAAAA CCGATGGCCT GGTTGCTGTT CATCCTAAAT CTGTTAATGT 2160

GGAGCAAACA GACTTTCACT ACAACTGGCT TATCTATCAC CTAAAGATGA GAACAAGCAG 2220

TATATACTTG TATGACTGCA CAGAGGTTTC CCCATACTGT CTCTTGTTTT TTGGAGGTGA 2280

CATTTCCATC CAGAAGGATA ACGATCAGGA AACTATTGCT GTAGATGAGT GGATTGTATT 2340

TCAGTCTCCA GCAAGAATTG CCCATCTTGT TAAGGAATTA AGAAAGGAAC TAGATATTCT 2400

TCTGCAAGAG AAGATTGAAA GTCCTCATCC TGTAGACTGG AATGACACTA AATCCAGAGA 2460

CTGTGCAGTA CTGTCAGCTA TTATAGACTT GATCAAAACA CAGGAAAAGG CAACTCCCAG 2520

GAACTTTCCG CCACGATTCC AGGATGGATA TTACAGCTGA CAGCTTTTCA GGGGTGGTCT 2580

GAAAAGCCAG TTTGACAGCC ATTCTTCATC ATTGTTTAAA TTTTGGCTGG ATGCCAAACC 2640

CTGGGACATG AACAATTTTC ATGTGTAAGG TAGAAGCCTT CAGTAGGTAG TAAAGACTTA 2700

ATGTGCATGA CTTGATGTTA TATGTAGAGA TATATATATA TATATATATA CCATAAAAGC 2760

AATATGTTCT CTGATCATAT ACTCTGCTGT GGTCATGCCC ACTCTTTGGG AGTATATTCC 2820

CTTTATATAT ATTGAGTATT GTACCACTTG AGAAATTCCT TTGTTCTGTT ATACAAAATT 2880

AATCTTTCTG CTCATAATGA TTGATGATAC CACCAGTAAA AATAGGATGT TTACCCCAAA 2940

ACAAGTGTCA ATTAAGAATT TGAACACAAC CACATTTTTT AAAATGAAAC TTCTATCGGA 3000

AGTAAATTAA TTTGTTGTAA TAAAGTCCAG TATTTAATAA AATGTACAAT GTTAAATCTC



SEQ ID NO:276 PBY3 Protein sequence:
Protein Accession #: BAA96012

IRNRSYIDRD SEYLLQENEP DGTLDQKLLE DLQKKKNDLR YIEMQHFREK LPSYGMQKEL 60
VNLIDNHQVT VISGETGCGK TTQVTQFILD NYIERGKGSA CRIVCTQPRR ISAISVAERV 120
AAERAESCGS GNSTGYQIRL QSRLPRKQGS ILYCTTGIIL QWLQSDPYLS SVSHIVLDEI 180
HERNLQSDVL MTVVKDLLNF RSDLKVILMS ATLNAEKFSE YFGNCPMIHI PGFTFPVVEY 240
LLEDVIEKIR YVPEQKEHRS QFKRGFMQGH VNRQEKEEKE AIYKERWPDY VRELRRRYSA 300
STVDVIEMME DDKVDLNLIV ALIRYIVLEE EDGAILVFLP GWDNISTLHD LLMSQVMFKS 360
DKFLIIPLHS LMPTVNQTQV FKRTPPGVRK IVIATNIAET SITIDDVVYV IDGGKIKETH 420
FDTQNNISTM SAEWVSKANA KQRKGRAGRV QPGHCYHLYN GLRASLLDDY QLPEILRTPL 480
EELCLQIKIL RLGGIAYFLS RLMDPPSNEA VLLSIRHLME LNALDKQEEL TPLGVHLARL 540
PVEPHIGKMI LFGALFCCLD PVLTIAASLS FKDPFVIPLG KEKIADARRK ELAKDTRSDH 600
LTVVNAFEGW EEARRRGFRY EKDYCWEYFL SSNTLQMLHN MKGQFAEHLL GAGFVSSRNP 660
KDPESNINSD NEKIIKAVIC AGLYPKVAKI RLNLGKKRKM VKVYTKTDGL VAVHPKSVNV 720
EQTDFHYNWL IYHLKMRTSS IYLYDCTEVS PYCLLFFGGD ISIQKDNDQE TIAVDEWIVF 780
QSPARIAHLV KELRKELDIL LQEKIESPHP VDWNDTKSRD CAVLSAIIDL IKTQEKATPR 840
NFPPRFQDGY YS



SEQ ID NO:277 PBY6 DNA SEQUENCE 

Nucleic Acid Accession #: AA464018
Coding sequence: 64-1669 (underlined sequence corresponds to start and stop codon)



GATTTTATCC TGGAACATTA CAGTGAAGAT GGCTATTTAT ATGAAGATGA AATTGCAGAT 60
CTTATGGATC TGAGACAAGC TTGTCGGACG CCTAGCCGGG ATGAGGCCGG GGTGGAACTG 120
CTGATGACAT ACTTCATCCA GCTGGGCTTT GTCGAGAGTC GATTCTTCCC GCCCACACGG 180
CAGATGGGAC TCCTGTTCAC CTGGTATGAC TCTCTCACCG GGGTTCCGGT CAGCCAGCAG 240
AACCTGCTGC TGGAGAAGGC CAGTGTCCTG TTCAACACTG GGGCCCTCTA CACCCAGATT 300
GGGACCCGGT GTGATCGGCA GACGCAGGCT GGGCTGGAGA GTGCCATAGA TGCCTTTCAG 360
AGAGCCGCAG GGGTTTTAAA TTACCTGAAA GACACATTTA CCCATACTCC AAGTTACGAC 420
ATGAGCCCTG CCATGCTCAG CGTGCTCGTC AAAATGATGC TTGCACAAGC CCAAGAAAGC 480
GTGTTTGAGA AAATCAGCCT TCCTGGGATC CGGAATGAAT TCTTCATGCT GGTGAAGGTG 540
GCTCAGGAGG CTGCTAAGGT GGGAGAGGTC TACCAACAGC TACACGCAGC CATGAGCCAG 600
GCGCCGGTGA AAGAGAACAT CCCCTACTCC TGGGCCAGCT TAGCCTGCGT GAAGGCCCAC 660
CACTACGCGG CCCTGGCCCA CTACTTCACT GCCATCCTCC TCATCGACCA CCAGGTGAAG 720
CCAGGCACGG ATCTGGACCA CCAGGAGAAG TGCCTGTCCC AGCTCTACGA CCACATGCCA 780
GAGGGGCTGA CACCCTTGGC CACACTGAAG AATGATCAGC AGCGCCGACA GCTGGGGAAG 840
TCCCACTTGC GCAGAGCCAT GGCTCATCAC GAGGAGTCGG TGCGGGAGGC CAGCCTCTGC 900
AAGAAGCTGC GGAGCATTGA GGTGCTACAG AAGGTGCTGT GTGCCGCACA GGAACGCTCC 960
CGGCTCACGT ACGCCCAGCA CCAGGAGGAG GATGACCTGC TGAACCTGAT CGACGCCCCC 1020
AGTGTTGTTG CTAAAACTGA GCAAGAGGTT GACATTATAT TGCCCCAGTT CTCCAAGCTG 1080
ACAGTCACGG ACTTCTTCCA GAAGCTGGGC CCCTTATCTG TGTTTTCGGC TAACAAGCGG 1140
TGGACGCCTC CTCGAAGCAT CCGCTTCACT GCAGAAGAAG GGGACTTGGG GTTCACCTTG 1200
AGAGGGAACG CCCCCGTTCA GGTTCACTTC CTGGATCCTT ACTGCTCTGC CTCGGTGGCA 1260
GGAGCCCGGG AAGGAGATTA TATTGTCTCC ATTCAGCTTG TGGATTGTAA GTGGCTGACG 1320
CTGAGTGAGG TTATGAAGCT GCTGAAGAGC TTTGGCGAGG ACGAGATCGA GATGAAAGTC 1380
GTGAGCCTCC TGGACTCCAC ATCATCCATG CATAATAAGA GTGCCACATA CTCCGTGGGA 1440
ATGCAGAAAA CGTACTCCAT GATCTGCTTA GCCATTGATG ATGACGACAA AACTGATAAA 1500
ACCAAGAAAA TCTCCAAGAA GCTTTCCTTC CTGAGTTGGG GCACCAACAA GAACAGACAG 1560
AAGTCAGCCA GCACCTTGTG CCTCCCATCG GTCGGGGCTG CACGGCCTCA GGTCAAGAAG 1620
AAGCTGCCCT CCCCTTTCAG CCTTCTCAAC TCAGACAGTT CTTGGTACTA A



SEQ ID NO:278 PBY6 Protein sequence: 
Protein Accession #: NP_149094

DFILEHYSED GYLYEDEIAD LMDLRQACRT PSRDEAGVEL LMTYFIQLGF VESRFFPPTR 60
QMGLLFTWYD SLTGVPVSQQ NLLLEKASVL FNTGALYTQI GTRCDRQTQA GLESAIDAFQ 120
RAAGVLNYLK DTFTHTPSYD MSPAMLSVLV KMMLAQAQES VFEKISLPGI RNEFFMLVKV 180
AQEAAKVGEV YQQLHAAMSQ APVKENIPYS WASLACVKAH HYAALAHYFT AILLIDHQVK 240
PGTDLDHQEK CLSQLYDHMP EGLTPLATLK NDQQRRQLGK SHLRRAMAHH EESVREASLC 300
KKLRSIEVLQ KVLCAAQERS RLTYAQHQEE DDLLNLIDAP SVVAKTEQEV DIILPQFSKL 360
TVTDFFQKLG PLSVFSANKR WTPPRSIRFT AEEGDLGFTL RGNAPVQVHF LDPYCSASVA 420
GAREGDYIVS IQLVDCKWLT LSEVMKLLKS FGEDEIEMKV VSLLDSTSSM HNKSATYSVG 480
MQKTYSMICL AIDDDDKTDK TKKISKKLSF LSWGTNKNRQ KSASTLCLPS VGAARPQVKK 540
KLPSPFSLLN SDSSWY






SEQ ID NO:279 PBY8 DNA SEQUENCE

Nucleic Acid Accession #: AF107493

Coding sequence: 125-556 (underlined sequence corresponds to start and stop codon)

1 11 21 31 41 51

| | | | | |

GAATTCGGCA CGAGCCTTGT TGGAGGTTCT GGGGCGCAGA ACCGCTACTG CTGCTTCGGT 60 

CTCTCCTTGG GAAAAAATAA AATTTGAACC TTTTGGAGCT GTGTGCTAAA TCTTCAGTGG 120 

GACAATGGGT TCAGACAAAA GAGTGAGTAG AACAGAGCGT AGTGGAAGAT ACGGTTCCAT 180

CATAGACAGG GATGACCGTG ATGAGCGTGA ATCCCGAAGC AGGCGGAGGG ACTCAGATTA 240 

CAAAAGATCT AGTGATGATC GGAGGGGTGA TAGATATGAT GACTACCGAG ACTATGACAG 300

TCCAGAGAGA GAGCGTGAAA GAAGGAACAG TGACCGATCC GAAGATGGCT ACCATTCAGA 360

TGGTGACTAT GGTGAGCACG ACTATAGGCA TGACATCAGT GACGAGAGGG AGAGCAAGAC 420 

CATCATGCTG CGCGGCCTTC CCATCACCAT CACAGAGAGC GATATTCGAG AAATGATGGA 480 

GTCCTTCGAA GGCCCTCAGC CTGCGGATGT GAGGCTGATG AAGAGGAAAA CAGGTGAGAG 540 

CTTGCTTAGT TCCTGATATT ATTGTTCTCT TCCCCATTCC CACCTCAGTC CCTAAAGAAC 600 

ATCCTGATTC CCCCAGTCTT CAAGCACATG AATTCAGAAT GAAAGGTTTG CCATGGCTAA 660 

GGAATGTGAC TCTTTGAAAA CCATGTTAGC ATCTGAGGAA CTTTTTTAAA CTTTGTTTTA 720 

GGGACTTTTT TTTCCTTAGG TAAGTAATGA TTTATAAACT CCTTTTTTTT TTTGACTATA 780 

GTCGGTTGCA TGGTTACTTT AAGCGTGGAA TCAAATGGAG TGGCATTTAG TTCAGGCGGC 840 

TTGTTCCTTG CCATGGCAAA GTATCAAGAA GATCCCCAAG TCAAGTCACA TTTGTAAAGC 900 

TGCTTCCCAA TTGGCTTTGT CACGCAGTGT TGAAGCAGTG GGAGAGAGAT TCACCTGTTA 960 

TAAAGGAACT GACTAACACA AGTATCCCGT CTATATCTGA ATGCTGTCTC TAGGTGTAAG 1020 

CCGTGGTTTC GCCTTCGTGG AGTTTTATCA CTTGCAAGAT GCTACCAGCT GGATGGAAGC 1080 

CAATCAGGTT GCTTCACTCA CCAAGTCTAG ATATTCATGA AAATGGAACA AGTCTGTACA 1140

ATTTTAAAAA AAGGTTGAAG GAGTGGTTTG TTCCAAAGGA GTGACTTTTT TTTAAAAAAA 1200 

AAGCTTTGTA TATATTAAAA TTGATGTTAC TAGAATAAGT ACAGTACCAA GGACTTCATT 1260 

ATAGAATTTG TTCTGCCTTT AAACATGGCT ACCTACCTGG CAGGGCTTTG TTAACTACTG 1320 

AATACCTGTC TGGTAATCAC TAAAACATCT TTATGTTTCC CTTTTTTCTA GTTTGTTATA 1380

TTCCTATTAT GTCCATTGAG AGTAAGCTTA GTATATCAAA CTCTCCATTT GACAGTGAAG 1440 

AGAACATAGT GAAAGTCTGT GGCGGCATTT TTATAAGTAA TTCCTTATTT CTGCCTGAAG 1500 

ACCACAAAGC CTCCTGGAGG CGTAACTGCT CAGACCGGTC TTCAGGGAAT ATTTAAGGAC 1560 

TTAGTGGAAT TTATGAACAA TAAGTCTGAT GAGATTAGCC TGGGAGTGGT GTCCTGCAGC 1620 

TGTCTAATCT AGAGTGGCAT TAACATTCTA ATCTCCTTGA GAATGCCTTT TATAGTCTGT 1680 

TCAAAGCAAG TCATTGATGG TTCTTCGAGG TAGTGTTAAC TGAAGTGTTC TTCAGTTTGT 1740 

CAAGATAATG TTCAGTGCTT GGCACTTAAA TAACATTTTT TGCAAGAACT CCAAGGCACA 1800 



TTATTGAATG CCTTTAACCA AGTGCATTCT GGGAAGTTTG CTTGACTCAT TATCTTGCTT 1860

TTCTGCAGCA TTCTGTGATT TGAGTCATCC ATGAATCCAT GAATAAAAGT TACATTCTTT 1920

GATTGGTAAT ATTGCCATTT ATAACAAGAC TCACTAATGA GGGTATCACT TTGACTGACT 1980

GATTTGTTAA AGTTTTTAAG CCTCTCATTT TCCTAACCCA GAAATCACAG CCTGATTTTA 2040

TTAAAAGTAG AGCTTCATTC ATTTCATACC ATAGATACCA TCCTAGTAAA TCCAGAACAT 2100

ATACAAGGTT CATGTGAGTC TGCTTTCTTG ACATGATAGC ATTGTTTGAT GCAGTGGATA 2160

TGTCAGAATG ACTAACCTAG GAGTTTGAAA CTCCTAAGAA ACTAAAACCT GTAAGACATT 2220

TAAAAGTCTC CACAATTTTA ATGTATACAA AGCTATGTTA CTGTGTAACA CATTACAGTT 2280

CAAATTCACT CCAGAAATAA AAGGCCAGTA GGATTAGGGA CTCACTGGTA GTTTGGAGTC 2340

TCCCAGCACA CATCCCTCCT AGTGGGATGA TCTATTCACA TATCTCCCAG CTTTTTTATT 2400

TTTGCTTCTG TATATCACAG TGAGTGGATG GCCCTTCAGC TTTTTCTCTC CTGGCCAGAC 2460

ATGCAGTCTT GCCTTTAGAT ATCGCAGAGA CAAAATTCAC AGCATGTCTT AAATCTTCCA 2520

GGATTTGCAA GAACCAAATT GCTCAACAGT ATGTATGTTT AGAGGGGTTA GACTCCTTTT 2580

TAAAATCTGG ATATCTAACC ACCTACTTAA ATCTGTTTGA TAGTGTCAAA CCACCCCCAC 2640
CCTTGATCCT CCCACCCCCA AAAAAAAAAA AAAA



SEQ ID NO:280 PBY8 Protein sequence:
Protein Accession #: XP_003261

MGSDKRVSRT ERSGRYGSII DRDDRDERES RSRRRDSDYK RSSDDRRGDR YDDYRDYDSP 60
ERERERRNSD RSEDGYHSDG DYGEHDYRHD ISDERESKTI MLRGLPITIT ESDIREMMES 120
FEGPQPADVR LMKRKTGESL LSS



SEQ ID NO:281 PCI2 DNA SEQUENCE

Nucleic Acid Accession #: AF208291

Coding sequence: 109-3705 (underlined sequence corresponds to start and stop codon)

1 11 21 31 41 51

| | | | | |

CGGCCGCTTT TTTCTCAAGA TGGCAGATTC CCACTGAGGC TGAGGGGGCC GAGCTCGCGC 60

GCCGCGTTCC CTTCTCCGTT GCCATGAACC GCGGACACCC CGGCCCCGAT GGCCCCCGTG 120

TACGAAGGTA TGGCCTCACA TGTGCAAGTT TTCTCCCCTC ACACCCTTCA ATCAAGTGCC 180

TTCTGTAGTG TGAAGAAACT AAAAGTAGAG CCAAGTTCCA ACTGGGACAT GACTGGGTAC 240

GGCTCCCACA GCAAAGTGTA CAGCCAGAGC AAGAACATAC CACCTTCTCA GCCAGCCTCC 300

ACAACCGTCA GCACCTCCTT GCCGGTCCCA AACCCAAGCC TACCTTACGA GCAGACCATC 360

GTCTTCCCAG GAAGCACCGG GCACATCGTG GTCACCTCAG CAAGCAGCAC TTCTGTCACC 420

GGGCAAGTCC TCGGCGGACC ACACAACCTA ATGCGTCGAA GCACTGTGAG CCTCCTTGAT 480

ACCTACCAAA AATGTGGACT CAAGCGTAAG AGCGAGGAGA TCGAGAACAC AAGCAGCGTG 540

CAGATCATCG AGGAGCATCC ACCCATGATT CAGAATAATG CAAGCGGGGC CACTGTCGCC 600

ACTGCCACCA CGTCTACTGC CACCTCCAAA AACAGCGGCT CCAACAGCGA GGGCGACTAT 660

CAGCTGGTGC AGCATGAGGT GCTGTGCTCC ATGACCAACA CCTACGAGGT CTTAGAGTTC 720

TTGGGCCGAG GGACGTTTGG ACAAGTGGTC AAGTGCTGGA AACGGGGCAC CAATGAGATC 780

GTAGCCATCA AGATCCTGAA GAACCGCCCA TCCTATGCCC GACAAGGTCA GATTGAAGTG 840

AGCATCCTGG CCCGGTTGAG CACGGAGAGT GCCGATGACT ATAACTTCGT CCGGGCCTAC 900

GAATGCTTCC AGCACAAGAA CCACACGTGC TTGGTCTTCG AGATGTTGGA GCAGAACCTC 960

TATGACTTTC TGAAGCAAAA CAAGTTTAGC CCCTTGCCCC TCAAATACAT TCGCCCAGTT 1020

CTCCAGCAGG TAGCCACAGC CCTGATGAAA CTCAAAAGCC TAGGTCTTAT CCACGCTGAC 1080

CTCAAACCAG AAAACATCAT GCTGGTGGAT CCATCTAGAC AACCATACAG AGTCAAGGTC 1140

ATCGACTTTG GTTCAGCCAG CCACGTCTCC AAGGCTGTGT GCTCCACCTA CTTGCAGTCC 1200

AGATATTACA GGGCCCCTGA GATCATCCTT GGTTTACCAT TTTGTGAGGC AATTGACATG 1260

TGGTCCCTGG GCTGTGTTAT TGCAGAATTG TTCCTGGGTT GGCCGTTATA TCCAGGAGCT 1320

TCGGAGTATG ATCAGATTCG GTATATTTCA CAAACACAGG GTTTGCCTGC TGAATATTTA 1380

TTAAGCGCCG GGACAAAGAC AACTAGGTTT TTCAACCGTG ACACGGACTC ACCATATCCT 1440

TTGTGGAGAC TGAAGACACC AGATGACCAT GAAGCAGAGA CAGGGATTAA GTCAAAAGAA 1500

GCAAGAAAGT ACATTTTCAA CTGTTTAGAT GATATGGCCC AGGTGAACAT GACGACAGAT 1560

TTGGAAGGGA GCGACATGTT GGTAGAAAAG GCTGACCGGC GGGAGTTCAT TGACCTGTTG 1620

AAGAAGATGC TGACCATTGA TGCTGACAAG AGAATCACTC CAATCGAAAC CCTGAACCAT 1680

CCCTTTGTCA CCATGACACA CTTACTCGAT TTTCCCCACA GCACACACGT CAAATCATGT 1740

TTCCAGAACA TGGAGATCTG CAAGCGTCGG GTGAATATGT ATGACACGGT GAACCAGAGC 1800

AAAACCCCTT TCATCACGCA CGTGGCCCCC AGCACGTCCA CCAACCTGAC CATGACCTTT 1860

AACAACCAGC TGACCACTGT CCACAACCAG GCTCCCTCCT CTACCAGTGC CACTATTTCC 1920

TTAGCCAATC CCGAAGTCTC CATACTAAAC TACCCATCTA CACTCTACCA GCCCTCAGCG 1980

GCATCCATGG CTGCAGTGGC CCAGCGGAGC ATGCCCCTGC AGACAGGAAC AGCCCAGATT 2040

TGTGCCCGGC CTGACCCGTT CCAGCAAGCT CTCATCGTGT GTCCCCCCGG CTTCCAAGGC 2100

TTGCAGGCCT CTCCCTCTAA GCACGCTGGC TACTCGGTGC GAATGGAAAA TGCAGTTCCC 2160

ATCGTCACTC AAGCCCCAGG AGCTCAGCCT CTTCAGATCC AACCAGGTCT GCTTGCCCAG 2220

CAGGCTTGGC CAAGTGGGAC CCAGCAGATC CTGCTTCCCC CAGCATGGCA GCAACTGACT 2280

GGAGTGGCCA CCCACACATC AGTGCAGCAT GCCACCGTGA TTCCCGAGAC CATGGCAGGC 2340

ACCCAGCAGC TGGCGGACTG GAGAAATACG CATGCTCACG GAAGCCATTA TAATCCCATC 2400

ATGCAGCAGC CTGCACTATT GACCGGTCAT GTGACCCTTC CAGCAGCACA GCCCTTAAAT 2460

GTGGGTGTGG CCCACGTGAT GCGGCAGCAG CCAACCAGCA CCACCTCCTC CCGGAAGAGT 2520

AAGCAGCACC AGTCATCTGT GAGAAATGTC TCCACCTGTG AGGTGTCCTC CTCTCAGGCC 2580

ATCAGCTCCC CACAGCGATC CAAGCGTGTC AAGGAGAACA CACCTCCCCG CTGTGCCATG 2640

GTGCACAGTA GCCCGGCCTG CAGCACCTCG GTCACCTGTG GGTGGGGCGA CGTGGCCTCC 2700

AGCACCACCC GGGAACGGCA GCGGCAGACA ATTGTCATTC CCGACACTCC CAGCCCCACG 2760

GTCAGCGTCA TCACCATCAG CAGTGACACG GACGAGGAGG AGGAACAGAA ACACGCCCCC 2820

ACCAGCACTG TCTCCAAGCA AAGAAAAAAC GTCATCAGCT GTGTCACAGT CCACGACTCC 2880

CCCTACTCCG ACTCCTCCAG CAACACCAGC CCCTACTCCG TGCAGCAGCG TGCTGGGCAC 2940

AACAATGCCA ATGCCTTTGA CACCAAGGGG AGCCTGGAGA ATCACTGCAC GGGGAACCCC 3000

CGAACCATCA TCGTGCCACC CCTGAAAACC CAGGCCAGCG AAGTATTGGT GGAGTGTGAT 3060

AGCCTGGTGC CAGTCAACAC CAGTCACCAC TCGTCCTCCT ACAAGTCCAA GTCCTCCAGC 3120

AACGTGACCT CCACCAGCGG TCACTCTTCA GGGAGCTCAT CTGGAGCCAT CACCTACCGG 3180

CAGCAGCGGC CGGGCCCCCA CTTCCAGCAG CAGCAGCCAC TCAATCTCAG CCAGGCTCAG 3240

CAGCACATCA CCACGGACCG CACTGGGAGC CACCGAAGGC AGCAGGCCTA CATCACTCCC 3300

ACCATGGCCC AGGCTCCGTA CTCCTTCCCG CACAACAGCC CCAGCCACGG CACTGTGCAC 3360

CCGCATCTGG CTGCAGCCGC TGCCGCTGCC CACCTCCCCA CCCAGCCCCA CCTCTACACC 3420

TACACTGCGC CGGCGGCCCT GGGCTCCACC GGCACCGTGG CCCACCTGGT GGCCTCGCAA 3480

GGCTCTGCGC GCCACACCGT GCAGCACACT GCCTACCCAG CCAGCATCGT CCACCAGGTC 3540

CCCGTGAGCA TGGGCCCCCG GGTCCTGCCC TCGCCCACCA TCCACCCGAG TCAGTATCCA 3600

GCCCAATTTG CCCACCAGAC CTACATCAGC GCCTCGCCAG CCTCCACCGT CTACACTGGA 3660

TACCCACTGA GCCCCGCCAA GGTCAACCAG TACCCTTACA TATAAACACT GGAGGGGAGG 3720

GAGGGAGGGA GGGAGGGAGA GAATGGCCCG AGGGAGGAGG GAGAGAAGGA GGGAGGCGCT 3780

CCTGGGACCG TGGGCGCTGG CCTTTTATAC TGAAGATGCC GCACACAAAC AATGCAAACG 3840

GGGCAGGGGC GGGGGGGGGG GGGGCAGAGG GCAGGGGGAC GGGTCGGGAC ACCAGTGAAA 3900

CTTGAACCGG GAAGTGGGAG GACGTAGAGC AGAGAAGAGA ACATTTTTAA AAGGAAGGGA 3960
TTAAAGAGGG TGGGAAATCT ATGGTTTTTA TTTTAAAAAA



SEQ ID NO:282 PCI2 Protein sequence: 
Protein Accession #: NP_073577 

MAPVYEGMAS HVQVFSPHTL QSSAFCSVKK LKVEPSSNWD MTGYGSHSKV YSQSKNIPPS 60
QPASTTVSTS LPVPNPSLPY EQTIVFPGST GHIVVTSASS TSVTGQVLGG PHNLMRRSTV 120
SLLDTYQKCG LKRKSEEIEN TSSVQIIEEH PPMIQNNASG ATVATATTST ATSKNSGSNS 180
EGDYQLVQHE VLCSMTNTYE VLEFLGRGTF GQVVKCWKRG TNEIVAIKIL KNRPSYARQG 240
QIEVSILARL STESADDYNF VRAYECFQHK NHTCLVFEML EQNLYDFLKQ NKFSPLPLKY 300
IRPVLQQVAT ALMKLKSLGL IHADLKPENI MLVDPSRQPY RVKVIDFGSA SHVSKAVCST 360
YLQSRYYRAP EIILGLPFCE AIDMWSLGCV IAELFLGWPL YPGASEYDQI RYISQTQGLP 420
AEYLLSAGTK TTRFFNRDTD SPYPLWRLKT PDDHEAETGI KSKEARKYIF NCLDDMAQVN 480
MTTDLEGSDM LVEKADRREF IDLLKKMLTI DADKRITPIE TLNHPFVTMT HLLDFPHSTH 540
VKSCFQNMEI CKRRVNMYDT VNQSKTPFIT HVAPSTSTNL TMTFNNQLTT VHNQAPSSTS 600
ATISLANPEV SILNYPSTLY QPSAASMAAV AQRSMPLQTG TAQICARPDP FQQALIVCPP 660
GFQGLQASPS KHAGYSVRME NAVPIVTQAP GAQPLQIQPG LLAQQAWPSG TQQILLPPAW 720
QQLTGVATHT SVQHATVIPE TMAGTQQLAD WRNTHAHGSH YNPIMQQPAL LTGHVTLPAA 780
QPLNVGVAHV MRQQPTSTTS SRKSKQHQSS VRNVSTCEVS SSQAISSPQR SKRVKENTPP 840
RCAMVHSSPA CSTSVTCGWG DVASSTTRER QRQTIVIPDT PSPTVSVITI SSDTDEEEEQ 900
KHAPTSTVSK QRKNVISCVT VHDSPYSDSS SNTSPYSVQQ RAGHNNANAF DTKGSLENHC 960
TGNPRTIIVP PLKTQASEVL VECDSLVPVN TSHHSSSYKS KSSSNVTSTS GHSSGSSSGA 1020
ITYRQQRPGP HFQQQQPLNL SQAQQHITTD RTGSHRRQQA YITPTMAQAP YSFPHNSPSH 1080
GTVHPHLAAA AAAAHLPTQP HLYTYTAPAA LGSTGTVAHL VASQGSARHT VQHTAYPASI 1140
VHQVPVSMGP RVLPSPTIHP SQYPAQFAHQ TYISASPAST VYTGYPLSPA KVNQYPYI

SEQ ID NO:283 PBY1 DNA SEQUENCE

Nucleic Acid Accession #: NM_017700

Coding sequence: 147-806 (underlined sequence corresponds to start and stop codon)

1 11 21 31 41 51

| | | | | |

AGTCACAGCC AGGTAACCCT GGAGTGAAGC GGTTTAGTTA GAAGGGAGCA GATAAACTCG 60 

TCACTCTAGT AGCTTTAACC CTCACCCTGA GGCACCTTAG CAATCAGCCA TTGCCTGCAA 120 

GCCTCCAAAG CTTGTCTTTG CCTAATATGG AGCCCAAAGA AGCCACTGGG AAAGAAAACA 180 

TGGTCACCAA GAAAAAGAAT CTGGCCTTCT TGAGGTCTAG ACTCTATATG CTGGAGAGAA 240 

GGAAGACTGA CACTGTGGTT GAGAGCAGTG TTTCTGGGGA CCACTCTGGC ACCTTGAGGA 300 

GGAGCCAATC TGACAGGACC GAATACAACC AGAAATTACA AGAAAAGATG ACTCCACAGG 360 

GTGAGTGTTC TGTAGCTGAG ACCTTAACCC CAGAGGAAGA GCATCATATG AAGAGGATGA 420 

TGGCAAAGCG GGAAAAGATC ATTAAGGAGC TGATACAGAC AGAAAAGGAT TATCTCAATG 480

ATCTAGAGCT GTGTGTTAGG GAAGTGGTTC AGCCCCTGAG AAATAAAAAG AGTGATAGGC 540 

TGGATGTGGA TAGCTTGTTT AGCAACATTG AGTCCGTGCA TCAGATATCA GCCAAGCTGC 600 

TGTCATTGTT GGAAGAGGCC ACAACAGACG TGGAACCGGC CATGCAAGTA ATTGGAGAAG 660 

TATTCTTGCA GATTAAAGGG CCACTGGAAG ATATTTATAA AATCTACTGC TATCACCATG 720 

ATGAAGCACA TAGTATACTG GAGTCCTATG AAAAGGAAGA AGAGCTGAAG GAACATTTGA 780 

GCCACTGTAT CCAGTCCTTA AAGTAAGGCC TTTTCAAATG ATGATTCCCA TCTCCTCTCA 840 

GTTGCCTAGC AGGGAACATT TTAAATGGAT GTAGATGAAA GGTCTCACAT AAATCCTATG 900 

TTTTATGAGA CTTGCTGGGA GCTCTGCTTT GCATTCCCTT TATAAAAAGC TGACATGCCA 960 

GAAGCCCTGA TTGACTTTTT TTCCCCCTGC GAGAATGACT AAAAATAACA TGGAAGAAGA 1020 

TTTAGAGCTC TGCAGCGATT GAAAAATGCA ATATCAAAAT ATAAAATGTG GAAGAAAAGC 1080 

CTCTTCTTAA AGCTATTGTA ACTTGCCTGG CCCCACGTAG TTCAAGGATT ATGTGAGATA 1140 

ACACGTGGCC CCATGACCAC TGGAGCACAT GGGTTAATGG AGTTAGGGGA ATGGCCTACA 1200 

ACTCTGCATG GCCGTCTTCT TTCCCCAAAC TCACTGTGGG GAGATGGGTG AAGACAAGTC 1260 

AGGCCTTGTT AAAGTTAGTT TCAGAACAAT TACTCATGCC TTCCTTTCTC ATCCCTAAAA 1320

CATTGGTGGG GGAGCTACAC AATGTACTTT TTCTTTTCTA GAGGAAGTAT CTATTCACTG 1380 

TGAAAATCTG AAAAATATAA CAAAGTATGT GTAAGATAAA AACCCCTTGC TATTTCAAAA 1440 
AAAAAAAAAA AAAAAAAAAA AAAA 

SEQ ID NO:284 PBY1 Protein sequence:
Protein Accession #: NP_060170

1 11 21 31 41 51

MEPKEATGKE NMVTKKKNLA FLRSRLYMLE RRKTDTVVES SVSGDHSGTL RRSQSDRTEY 60

NQKLQEKMTP QGECSVAETL TPEEEHHMKR MMAKREKIIK ELIQTEKDYL NDLELCVREV 120

VQPLRNKKTD RLDVDSLFSN IESVHQISAK LLSLLEEATT DVEPAMQVIG EVFLQIKGPL 180
EDIYKIYCYH HDEAHSILES YEKEEELKEH LSHCIQSLK



SEQ ID NO:285 PBQ9 DNA SEQUENCE 

Nucleic Acid Accession #: X66534
Coding sequence: 523-2676 (underlined sequence corresponds to start and stop codon)



1 11 21 31 41 51 

15 CCCTTATGGC GATTGGGCGG CTGCAGAGAC CAGGACTCAG TTCCCCTGCC CTAGTCTGAG 60 

CCTAGTGGGT GGGACTCAGC TCAGAGTCAG TTTTCAGAAG CAGGTTTCAG TTGCAGAGTT 120 

TTCCTACACT TTTCCTGCGC TAGAGCAGCG AGCAGCCTGG AACAGACCCA GGCGGAGGAC 180 

ACCTGTGGGG GAGGGAGCGC CTGGAGGAGC TTAGAGACCC CAGCCGGGCG TGATCTCACC 240 

ATGTGCGGAT TTGCGAGGCG CGCCCTGGAG CTGCTAGAGA TCCGGAAGCA CAGCCCCGAG 300 

20 GTGTGCGAAG CCACCAAGAC TGCGGCTCTT GGAGAAAGCG TGAGCAGGGG GCCACCGCGG 360 

TCTCCGGCCT GTCTGCACCC TGTCGCCTGA GCTGCCTGAC AGTGACAATG ACATCCCAGT 420 

TACCAGTGTC CTTGAATTGA TAGTGGCTTC TGTTTGTCAG TC TC AT AT AA GAACTACAGC 480 

TCATCAGGAG GAGATCGCAG CAGGGTAAGA GACACCAACA CCATGTTCTG CACGAAGCTC 540 

AAGGATCTCA AGATCACAGG AGAGTGTCCT TTCTCCTTAC TGGCACCAGG TCAAGTTCCT 600 

:*=25 AACGAGTCTT CAGAGGAGGC AGCAGGAAGC TCAGAGAGCT GCAAAGCAAC CGTGCCCATC 660 

W TGTCAAGACA TTCCTGAGAA GAACATACAA GAAAGTCTTC CTCAAAGAAA AACCAGTCGG 720 

~D AGCCGAGTCT ATCTTCACAC TTTGGCAGAG AGTATTTGCA AACTGATTTT CCCAGAGTTT 780 

% GAACGGCTGA ATGTTGC AC T TCAGAGAACA TTGGCAAAGC ACAAAATAAA AGAAAGCAGG 840 

'"'^ AAATCTTTGG AAAGAGAAGA CTTTGAAAAA ACAATTGCAG AGCAAGCAGT GCAGCAGAGT 900 

30 CCAGTGGAGT TATCAAAGAA TCTCTTGGTG AAGAGGTTTT TAAAATATGT TACGAGGAAG 960 

-~ ATGAAAACAT CCTTGGGGTG GTTGGAGGCA CCCTTAAAGA TTTTTAAACA GCTTCAGTAC 1020 

CCTTCTGAAA CAGAGCAGCC ATTGCCAAGA AGCAGGAAAA AGGGGCAGCT TGAGGACGCC 1080 

TCCATTCTAT GCCTGGATAA GGAGGATGAT TTTCTACATG TTTACTACTT CTTCCCTAAG 1140 

AGAACCACCT CCCTGATTCT TCCCGGCATC ATAAAGGCAG CTGCTCACGT ATTATATGAA 1200 

V'B5 ACGGAAGTGG AAGTGTCGTT AATGCCTCCC TGCTTCCATA ATGATTGCAG CGAGTTTGTG 1260 

-f ~ AATCAGCCCT ACTTGTTGTA CTCCGTTCAC ATGAAAAGCA CCAAGCCATC CCTGTCCCCC 1320 

^ AGCAAACCCC AGTCCTCGCT GGTGATTCCC ACATCGCTAT TCTGCAAGAC ATTTCCATTC 1380 

CATTTCATGT TTG AC AAAG A TATGACAATT CTGCAATTTG GCAATGGCAT CAGAAGGCTG 1440 

ATGAACAGGA GAG AC TTTC A AGGAAAGCCT AATTTTGAAT ACTTTGAAAT TCTGACTCCA 1500 

*M0 AAAATCAACC AGACCTTTAG CGGGATCATG AC T ATGTTG A ATATGCAGTT TGTTGTACGA 1560 

GTGAGGAGAT GGGACAACTC TGTGAAGAAA TCTTCAAGGG TTATGGACCT CAAAGGCCAA 1620

ATGATCTACA TTGTTGAATC CAGTGCAATC TTGTTTTTGG GGTCACCCTG TGTGGACAGA 1680

TTAGAAGATT TTACAGGACG AGGGCTCTAC CTCTCAGACA TCCCAATTCA CAATGCACTG 1740

AGGGATGTGG TCTTAATAGG GGAACAAGCC CGAGCTCAAG ATGGCCTGAA GAAGAGGCTG 1800

GGGAAGCTGA AGGCTACCCT TGAGCAAGCC CACCAAGCCC TGGAGGAGGA GAAGAAAAAG 1860

ACAGTAGACC TTCTGTGCTC CATATTTCCC TGTGAGGTTG CTCAGCAGCT GTGGCAAGGG 1920

CAAGTTGTGC AAGCCAAGAA GTTCAGTAAT GTCACCATGC TCTTCTCAGA CATCGTTGGG 1980 

TTCACTGCCA TCTGCTCCCA GTGCTCACCG CTGCAGGTCA TCACCATGCT CAATGCACTG 2040 

TACACTCGCT TCGACCAGCA GTGTGGAGAG CTGGATGTCT ACAAGGTGGA GACCATTGCG 2100 

ATGCCTATTG TGTGGCTTGG GGGATTACAC AAAGAGAGTG ATACTCATGC TGTTCAGATA 2160

GCGCTGATGG CCCTGAAGAT GATGGAGCTC TCTGATGAAG TTATGTCTCC CCATGGAGAA 2220 

CCTATCAAGA TGCGAATTGG ACTGCACTCT GGATCAGTTT TTGCTGGCGT CGTTGGAGTT 2280 

AAAATGCCCC GTTACTGTCT TTTTGGAAAC AATGTCACTC TGGCTAACAA ATTTGAGTCC 2340 

TGCAGTGTAC CACGAAAAAT CAATGTCAGC CCAACAACTT ACAGATTACT CAAAGACTGT 2400 

CCTGGTTTCG TGTTTACCCC TCGATCAAGG GAGGAACTTC CACCAAACTT CCCTAGTGAA 2460

ATCCCCGGAA TCTGCCATTT TCTGGATGCT TACCAACAAG GAACAAACTC AAAACCATGC 2520 

TTCCAAAAGA AAGATGTGGA AGATGCAAGC CAATTTTTTA GGCAAAGCAT CAGGAATAGA 2580 

TTAGCAACCT ATATACCTAT TTATAAGTCT TTGGGGTTTG ACTCATTGAA GATGTGTAGA 2640 

GCCTCTGAAA GCACTTTAGG GATTGTAGAT GGCTAACAAG CAGTATTAAA ATTTCAGGAG 2700 

CCAAGTCACA ATCTTTCTCC TGTTTAACAT GACAAAATGT ACTCACTTCA GTACTTCAGC 2760

TCTTCAAGAA AAAAAAAAAA ACCTTAAAAA GCTACTTTTG TGGGAGTATT TCTATTATAT 2820

AACCAGCACT TACTACCTGT ACTCAAAATT CAGCACCTTG TACATATATC AGATAATTGT 2880

AGTCAATTGT ACAAACTGAT GGAGTCACCT GCAATCTCAT ATCCTGGTGG AATGCCATGG 2940

TTATTAAAGT GTGTTTGTGA TAGTTGTCGT CAAAAAAAAA AAAAAAAAAA AAAAAAAAAA 3000
AAAA



SEQ ID NO:286 PRQ9 Protein sequence: 
Protein Accession #: Q02108 

1 11 21 31 41 51

MFCTKLKDLK ITGECPFSLL APGQVPNESS EEAAGSSESC KATVPICQDI PEKNIQESLP 60

QRKTSRSRVY LHTLAESICK LIFPEFERLN VALQRTLAKH KIKESRKSLE REDFEKTIAE 120

QAVAAGVPVE VIKESLGEEV FKICYEEDEN ILGVVGGTLK DFLNSFSTLL KQSSHCQEAG 180

KRGRLEDASI LCLDKEDDFL HVYYFFPKRT TSLILPGIIK AAAHVLYETE VEVSLMPPCF 240

HNDCSEFVNQ PYLLYSVHMK STKPSLSPSK PQSSLVIPTS LFCKTFPFHF MFDKDMTILQ 300

FGNGIRRLMN RRDFQGKPNF EEYFEILTPK INQTFSGIMT MLNMQFVVRV RRWDNSVKKS 360

SKVMDLKGQM IYIVESSAIL FLGSPCVDRL EDFTGRGLYL SDIPIHNALR DVVLIGEQAR 420

AQDGLKKRLG KLKATLEQAH QALEEEKKKT VDLLCSIFPC EVAQQLWQGQ VVQAKKFSNV 480

TMLFSDIVGF TAICSQCSPL QVITMLNALY TRFDQQCGEL DVYKVETIGD AYCVAGGLHK 540

ESDTHAVQIA LMALKMMELS DEVMSPHGEP IKMRIGLHSG SVFAGVVGVK MPRYCLFGNN 600
VTLANKFESC SVPRKINVSP TTYRLLKDCP GFVFTPRSRE ELPPNFPSEI PGICHFLDAY 660
QQGTNSKPCF QKKDVEDGNA NFLGKASGID



SEQ ID NO:287 PFD2 DNA SEQUENCE

Nucleic Acid Accession #: NM_000720

Coding sequence: 119-6664 (underlined sequence corresponds to start and stop codon)



1 11 21 31 41 51

| | | | | |

AGAATAAGGG CAGGGACCGC GGCTCCTATC TCTTGGTGAT CCCCTTCCCC ATTCCGCCCC 60 

CGCCTCAACG CCCAGCACAG TGCCCTGCAC ACAGTAGTCG CTCAATAAAT GTTCGTGGAT 120 

GATGATGATG ATGATGATGA AAAAAATGCA GCATCAACGG CAGCAGCAAG CGGACCACGC 180 

GAACGAGGCA AACTATGCAA GAGGCACCAG ACTTCCTCTT TCTGGTGAAG GACCAACTTC 240 

TCAGCCGAAT AGCTCCAAGC AAACTGTCCT GTCTTGGCAA GCTGCAATCG ATGCTGCTAG 300 

ACAGGCCAAG GCTGCCCAAA CTATGAGCAC CTCTGCACCC CCACCTGTAG GATCTCTCTC 360 

CCAAAGAAAA CGTCAGCAAT ACGCCAAGAG CAAAAAACAG GGTAACTCGT CCAACAGCCG 420 

ACCTGCCCGC GCCCTTTTCT GTTTATCACT CAATAACCCC ATCCGAAGAG CCTGCATTAG 480 

TATAGTGGAA TGGAAACCAT TTGACATATT TATATTATTG GCTATTTTTG CCAATTGTGT 540 

GGCCTTAGCT ATTTACATCC CATTCCCTGA AGATGATTCT AATTCAACAA ATCATAACTT 600 

GGAAAAAGTA GAATATGCCT TCCTGATTAT TTTTACAGTC GAGACATTTT TGAAGATTAT 660 

AGCGTATGGA TTATTGCTAC ATCCTAATGC TTATGTTAGG AATGGATGGA ATTTACTGGA 720 

TTTTGTTATA GTAATAGTAG GATTGTTTAG TGTAATTTTG GAACAATTAA CCAAAGAAAC 780 

AGAAGGCGGG AACCACTCAA GCGGCAAATC TGGAGGCTTT GATGTCAAAG CCCTCCGTGC 840 

CTTTCGAGTG TTGCGACCAC TTCGACTAGT GTCAGGGGTG CCCAGTTTAC AAGTTGTCCT 900 

GAACTCCATT ATAAAAGCCA TGGTTCCCCT CCTTCACATA GCCCTTTTGG TATTATTTGT 960 

AATCATAATC TATGCTATTA TAGGATTGGA ACTTTTTATT GGAAAAATGC ACAAAACATG 1020 

TTTTTTTGCT GACTCAGATA TCGTAGCTGA AGAGGACCCA GCTCCATGTG CGTTCTCAGG 1080 

GAATGGACGC CAGTGTACTG CCAATGGCAC GGAATGTAGG AGTGGCTGGG TTGGCCCGAA 1140 

CGGAGGCATC ACCAACTTTG ATAACTTTGC CTTTGCCATG CTTACTGTGT TTCAGTGCAT 1200 

CACCATGGAG GGCTGGACAG ACGTGCTCTA CTGGGTAAAT GATGCGATAG GATGGGAATG 1260 

GCCATGGGTG TATTTTGTTA GTCTGATCAT CCTTGGCTCA TTTTTCGTCC TTAACCTGGT 1320 

TCTTGGTGTC CTTAGTGGAG AATTCTCAAA GGAAAGAGAG AAGGCAAAAG CACGGGGAGA 1380 

TTTCCAGAAG CTCCGGGAGA AGCAGCAGCT GGAGGAGGAT CTAAAGGGCT ACTTGGATTG 1440 

GATCACCCAA GCTGAGGACA TCGATCCGGA GAATGAGGAA GAAGGAGGAG AGGAAGGCAA 1500 

ACGAAATACT AGCATGCCCA CCAGCGAGAC TGAGTCTGTG AACACAGAGA ACGTCAGCGG 1560 

TGAAGGCGAG AACCGAGGCT GCTGTGGAAG TCTCTGGTGC TGGTGGAGAC GGAGAGGCGC 1620 

GGCCAAGGCG GGGCCCTCTG GGTGTCGGCG GTGGGGTCAA GCCATCTCAA AATCCAAACT 1680 

CAGCCGACGC TGGCGTCGCT GGAACCGATT CAATCGCAGA AGATGTAGGG CCGCCGTGAA 1740 

GTCTGTCACG TTTTACTGGC TGGTTATCGT CCTGGTGTTT CTGAACACCT TAACCATTTC 1800 

CTCTGAGCAC TACAATCAGC CAGATTGGTT GACACAGATT CAAGATATTG CCAACAAAGT 1860 

CCTCTTGGCT CTGTTCACCT GCGAGATGCT GGTAAAAATG TACAGCTTGG GCCTCCAAGC 1920 

ATATTTCGTC TCTCTTTTCA ACCGGTTTGA TTGCTTCGTG GTGTGTGGTG GAATCACTGA 1980 

GACGATCCTG GTGGAACTGG AAATCATGTC TCCCCTGGGG ATCTCTGTGT TTCGGTGTGT 2040 

GCGCCTCTTA AGAATCTTCA AAGTGACCAG GCACTGGACT TCCCTGAGCA ACTTAGTGGC 2100

ATCCTTATTA AACTCCATGA AGTCCATCGC TTCGCTGTTG CTTCTGCTTT TTCTCTTCAT 2160 

TATCATCTTT TCCTTGCTTG GGATGCAGCT GTTTGGCGGC AAGTTTAATT TTGATGAAAC 2220 

GCAAACCAAG CGGAGCACCT TTGACAATTT CCCTCAAGCA CTTCTCACAG TGTTCCAGAT 2280 

CCTGACAGGC GAAGACTGGA ATGCTGTGAT GTACGATGGC ATCATGGCTT ACGGGGGCCC 2340 

ATCCTCTTCA GGAATGATCG TCTGCATCTA CTTCATCATC CTCTTCATTT GTGGTAACTA 2400 

TATTCTACTG AATGTCTTCT TGGCCATCGC TGTAGACAAT TTGGCTGATG CTGAAAGTCT 2460 

GAACACTGCT CAGAAAGAAG AAGCGGAAGA AAAGGAGAGG AAAAAGATTG CCAGAAAAGA 2520 

GAGCCTAGAA AATAAAAAGA ACAACAAACC AGAAGTCAAC CAGATAGCCA ACAGTGACAA 2580 

CAAGGTTACA ATTGATGACT ATAGAGAAGA GGATGAAGAC AAGGACCCCT ATCCGCCTTG 2640 

CGATGTGCCA GTAGGGGAAG AGGAAGAGGA AGAGGAGGAG GATGAACCTG AGGTTCCTGC 2700 

CGGACCCCGT CCTCGAAGGA TCTCGGAGTT GAACATGAAG GAAAAAATTG CCCCCATCCC 2760 

TGAAGGGAGC GCTTTCTTCA TTCTTAGCAA GACCAACCCG ATCCGCGTAG GCTGCCACAA 2820 

GCTCATCAAC CACCACATCT TCACCAACCT CATCCTTGTC TTCATCATGC TGAGCAGCGC 2880 

TGCCCTGGCC GCAGAGGACC CCATCCGCAG CCACTCCTTC CGGAACACGA TACTGGGTTA 2940 

CTTTGACTAT GCCTTCACAG CCATCTTTAC TGTTGAGATC CTGTTGAAGA TGACAACTTT 3000 

TGGAGCTTTC CTCCACAAAG GGGCCTTCTG CAGGAACTAC TTCAATTTGC TGGATATGCT 3060 

GGTGGTTGGG GTGTCTCTGG TGTCATTTGG GATTCAATCC AGTGCCATCT CCGTTGTGAA 3120

GATTCTGAGG GTCTTAAGGG TCCTGCGTCC CCTCAGGGCC ATCAACAGAG CAAAAGGACT 3180 

TAAGCACGTG GTCCAGTGCG TCTTCGTGGC CATCCGGACC ATCGGCAACA TCATGATCGT 3240 

CACTACCCTC CTGCAGTTCA TGTTTGCCTG TATCGGGGTC CAGTTGTTCA AGGGGAAGTT 3300 

CTATCGCTGT ACGGATGAAG CCAAAAGTAA CCCTGAAGAA TGCAGGGGAC TTTTCATCCT 3360

CTACAAGGAT GGGGATGTTG ACAGTCCTGT GGTCCGTGAA CGGATCTGGC AAAACAGTGA 3420 

TTTCAACTTC GACAACGTCC TCTCTGCTAT GATGGCGCTC TTCACAGTCT CCACGTTTGA 3480 

GGGCTGGCCT GCGTTGCTGT ATAAAGCCAT CGACTCGAAT GGAGAGAACA TCGGCCCAAT 3540 

CTACAACCAC CGCGTGGAGA TCTCCATCTT CTTCATCATC TACATCATCA TTGTAGCTTT 3600 

CTTCATGATG AACATCTTTG TGGGCTTTGT CATCGTTACA TTTCAGGAAC AAGGAGAAAA 3660

AGAGTATAAG AACTGTGAGC TGGACAAAAA TCAGCGTCAG TGTGTTGAAT ACGCCTTGAA 3720 

AGCACGTCCC TTGCGGAGAT ACATCCCCAA AAACCCCTAC CAGTACAAGT TCTGGTACGT 3780 

GGTGAACTCT TCGCCTTTCG AATACATGAT GTTTGTCCTC ATCATGCTCA ACACACTCTG 3840 

CTTGGCCATG CAGCACTACG AGCAGTCCAA GATGTTCAAT GATGCCATGG ACATTCTGAA 3900 

CATGGTCTTC ACCGGGGTGT TCACCGTCGA GATGGTTTTG AAAGTCATCG CATTTAAGCC 3960 

TAAGGGGTAT TTTAGTGACG CCTGGAACAC GTTTGACTCC CTCATCGTAA TCGGCAGCAT 4020 

TATAGACGTG GCCCTCAGCG AAGCGGACCC AACTGAAAGT GAAAATGTCC CTGTCCCAAC 4080 
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5 

TGCTACACCT GGGAACTCTG AAGAGAGCAA TAGAATCTCC ATCACCTTTT TCCGTCTTTT 4140
CCGAGTGATG CGATTGGTGA AGCTTCTCAG CAGGGGGGAA GGCATCCGGA CATTGCTGTG 4200
GACTTTTATT AAGTCCTTTC AGGCGCTCCC GTATGTGGCC CTCCTCATAG CCATGCTGTT 4260
CTTCATCTAT GCGGTCATTG GCATGCAGAT GTTTGGGAAA GTTGCCATGA GAGATAACAA 4320
CCAGATCAAT AGGAACAATA ACTTCCAGAC GTTTCCCCAG GCGGTGCTGC TGCTCTTCAG 4380
GTGTGCAACA GGTGAGGCCT GGCAGGAGAT CATGCTGGCC TGTCTCCCAG GGAAGCTCTG 4440
TGACCCTGAG TCAGATTACA ACCCCGGGGA GGAGTATACA TGTGGGAGCA ACTTTGCCAT 4500
TGTCTATTTC ATCAGTTTTT ACATGCTCTG TGCATTTCTG ATCATCAATC TGTTTGTGGC 4560
TGTCATCATG GATAATTTCG ACTATCTGAC CCGGGACTGG TCTATTTTGG GGCCTCACCA 4620
TTTAGATGAA TTCAAAAGAA TATGGTCAGA ATATGACCCT GAGGCAAAGG GAAGGATAAA 4680
ACACCTTGAT GTGGTCACTC TGCTTCGACG CATCCAGCCT CCCCTGGGGT TTGGGAAGTT 4740
ATGTCCACAC AGGGTAGCGT GCAAGAGATT AGTTGCCATG AACATGCCTC TCAACAGTGA 4800
CGGGACAGTC ATGTTTAATG CAACCCTGTT TGCTTTGGTT CGAACGGCTC TTAAGATCAA 4860
GACCGAAGGG AACCTGGAGC AAGCTAATGA AGAACTTCGG GCTGTGATAA AGAAAATTTG 4920
GAAGAAAACC AGCATGAAAT TACTTGACCA AGTTGTCCCT CCAGCTGGTG ATGATGAGGT 4980
AACCGTGGGG AAGTTCTATG CCACTTTCCT GATACAGGAC TACTTTAGGA AATTCAAGAA 5040
ACGGAAAGAA CAAGGACTGG TGGGAAAGTA CCCTGCGAAG AACACCACAA TTGCCCTACA 5100
GGCGGGATTA AGGACACTGC ATGACATTGG GCCAGAAATC CGGCGTGCTA TATCGTGTGA 5160
TTTGCAAGAT GACGAGCCTG AGGAAACAAA ACGAGAAGAA GAAGATGATG TGTTCAAAAG 5220
AAATGGTGCC CTGCTTGGAA ACCATGTCAA TCATGTTAAT AGTGATAGGA GAGATTCCCT 5280
TCAGCAGACC AATACCACCC ACCGTCCCCT GCATGTCCAA AGGCCTTCAA TTCCACCTGC 5340
AAGTGATACT GAGAAACCGC TGTTTCCTCC AGCAGGAAAT TCGGTGTGTC ATAACCATCA 5400
TAACCATAAT TCCATAGGAA AGCAAGTTCC CACCTCAACA AATGCCAATC TCAATAATGC 5460
CAATATGTCC AAAGCTGCCC ATGGAAAGCG GCCCAGCATT GGGAACCTTG AGCATGTGTC 5520
TGAAAATGGG CATCATTCTT CCCACAAGCA TGACCGGGAG CCTCAGAGAA GGTCCAGTGT 5580
GAAAAGAACC CGCTATTATG AAACTTACAT TAGGTCCGAC TCAGGAGATG AACAGCTCCC 5640
AACTATTTGC CGGGAAGACC CAGAGATACA TGGCTATTTC AGGGACCCCC ACTGCTTGGG 5700
GGAGCAGGAG TATTTCAGTA GTGAGGAATG CTACGAGGAT GACAGCTCGC CCACCTGGAG 5760
CAGGCAAAAC TATGGCTACT ACAGCAGATA CCCAGGCAGA AACATCGACT CTGAGAGGCC 5820
CCGAGGCTAC CATCATCCCC AAGGATTCTT GGAGGACGAT GACTCGCCCG TTTGCTATGA 5880
TTCACGGAGA TCTCCAAGGA GACGCCTACT ACCTCCCACC CCAGCATCCC ACCGGAGATC 5940
CTCCTTCAAC TTTGAGTGCC TGCGCCGGCA GAGCAGCCAG GAAGAGGTCC CGTCGTCTCC 6000
CATCTTCCCC CATCGCACGG CCCTGCCTCT GCATCTAATG CAGCAACAGA TCATGGCAGT 6060
TGCCGGCCTA GATTCAAGTA AAGCCCAGAA GTACTCACCG AGTCACTCGA CCCGGTCGTG 6120
GGCCACCCCT CCAGCAACCC CTCCCTACCG GGACTGGACA CCGTGCTACA CCCCCCTGAT 6180
CCAAGTGGAG CAGTCAGAGG CCCTGGACCA GGTGAACGGC AGCCTGCCGT CCCTGCACCG 6240
CAGCTCCTGG TACACAGACG AGCCCGACAT CTCCTACCGG ACTTTCACAC CAGCCAGCCT 6300
GACTGTCCCC AGCAGCTTCC GGAACAAAAA CAGCGACAAG CAGAGGAGTG CGGACAGCTT 6360
GGTGGAGGCA GTCCTGATAT CCGAAGGCTT GGGACGCTAT GCAAGGGACC CAAAATTTGT 6420
GTCAGCAACA AAACACGAAA TCGCTGATGC CTGTGACCTC ACCATCGACG AGATGGAGAG 6480
TGCAGCCAGC ACCCTGCTTA ATGGGAACGT GCGTCCCCGA GCCAACGGGG ATGTGGGCCC 6540
CCTCTCACAC CGGCAGGACT ATGAGCTACA GGACTTTGGT CCTGGCTACA GCGACGAAGA 6600
GCCAGACCCT GGGAGGGATG AGGAGGACCT GGCGGATGAA ATGATATGCA TCACCACCTT 6660
GTAGCCCCCA GCGAGGGGCA GACTGGCTCT GGCCTCAGGT GGGGCGCAGG AGAGCCAGGG 6720
GAAAAGTGCC TCATAGTTAG GAAAGTTTAG GCACTAGTTG GGAGTAATAT TCAATTAATT 6780
AGACTTTTGT ATAAGAGATG TCATGCCTCA AGAAAGCCAT AAACCTGGTA GGAACAGGTC 6840
CCAAGCGGTT GAGCCTGGCA GAGTACCATG CGCTCGGCCC CAGCTGCAGG AAACAGCAGG 6900
ccccGcccrc TCACAGAGGA TGGGTGAGGA GGCCAGACCT GCCCTGCCCC ATTGTCCAGA 6960
TGGGCACTGC TGTGGAGTCT GCTTCTCCCA TGTACCAGGG CACCAGGCCC ACCCAACTGA 7020
AGGCATGGCG GCGGGGTGCA GGGGAAAGTT AAAGGTGATG ACGATCATCA CACCTCGTGT 7080
CGTTACCTCA GCCATCGGTC TAGCATATCA GTCACTGGGC CCAACATATC CATTTTTAAA 7140
CCCTTTCCCC CAAATACACT GCGTCCTGGT TCCTGTTTAG CTGTTCTGAA ATA



SEQ ID NO:288 PFD2 Protein sequence:
Protein Accession #: A38198



1 11 21 31 41 51
| | | | | |
MMMMMMMKKM QHQRQQQADH ANEANYARGT RLPLSGEGPT SQPNSSKQTV LSWQAAIDAA 60
RQAKAAQTMS TSAPPPVGSL SQRKRQQYAK SKKQGKSSNS RPARALFCLS LNNPIRRACI 120
SIVEWKPFDI FILLAIFANC VALAIYIPFP EDDSNSTNHN LEKVEYAFLI IFTVETFLKI 180
IAYGLLLHPN AYVRNGWNLL DFVIVIVGLF SVILEQLTKE TEGGNHSSGK SGGFDVKALR 240
AFRVLRPLRL VSGVPSLQVV LNSIIKAMVP LLHIALLVLF VIIIYAIIGL ELFIGKMHKT 300
CFFADSDIVA EEDPAPCAFS GNGRQCTANG TECRSGWVGP NGGITNFDNF AFAMLTVFQC 360
ITMEGWTDVL YWVNDAIGWE WPWVYFVSLI ILGSFFVLNL VLGVLSGEFS KEREKAKARG 420
DFQKLREKQQ LEEDLKGYLD WITQAEDIDP ENEEEGGEEG KRNTSMPTSE TESVNTENVS 480
GEGENRGCCG SLWCWWRRRG AAKAGPSGCR RWGQAISKSK LSRRWRRWNR FNRRRCRAAV 540
KSVTFYWLVI VLVFLNTLTI SSEHYNQPDW LTQIQDIANK VLLALFTCEM LVKMYSLGLQ 600
AYFVSLFNRF DCFVVCGGIT ETILVELEIM SPLGISVFRC VRLLRIFKVT RHWTSLSNLV 660
ASLLNSMKSI ASLLLLLFLF IIIFSLLGMQ LFGGKFNFDE TQTKRSTFDN FPQALLTVFQ 720
ILTGEDWNAV MYDGIMAYGG PSSSGMIVCI YFIILFICGN YILLNVFLAI AVDNLADAES 780
LNTAQKEEAE EKERKKIARK ESLENKKNNK PEVNQIANSD NKVTIDDYRE EDEDKDPYPP 840
CDVPVGEEEE EEEEDEPEVP AGPRPRRISE LNMKEKIAPI PEGSAFFILS KTNPIRVGCH 900
KLINHHIFTN LILVFIMLSS AALAAEDPIR SHSFRNTILG YFDYAFTAIF TVEILLKMTT 960
FGAFLHKGAF CRNYFNLLDM LVVGVSLVSF GIQSSAISVV KILRVLRVLR PLRAINRAKG 1020
LKHVVQCVFV AIRTIGNIMI VTTLLQFMFA CIGVQLFKGK FYRCTDEAKS NPEECRGLFI 1080
LYKDGDVDSP VVRERIWQNS DFNFDNVLSA MMALFTVSTF EGWPALLYKA IDSNGENIGP 1140
IYNHRVEISI FFIIYIIIVA FFMMNIFVGF VIVTFQEQGE KEYKNCELDK NQRQCVEYAL 1200
KARPLRRYIP KNPYQYKFWY VVNSSPFEYM MFVLIMLNTL CLAMQHYEQS KMFNDAMDIL 1260
NMVFTGVFTV EKVLKVIAFK PKGYFSDAWN TFDSLIVIGS IIDVALSEAD PTESENVPVP 1320
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TATPGNSEES NRISITFFRL FRVHRLVKLL SRGEGIRTLL WTFIKSFQAL PYVALLIAML 1380
FFIYAVIGMQ MFGKVAMRDN NQINRNNNFQ TFPQAVLLLF RCATGEAWQE IMLACLPGKL 1440
CDPESDYNFG EEYTCGSNFA IVYFISFYML CAFLIINLFV AVIMDNFDYL TRDWSILGPH 1500
HLDEFKRIWS EYDPEAKGRI KHLDVVTLLR RIQPPLGFGK LCPHRVACKR LVAMNMPLNS 1560
DGTVMFNATL FALVRTALKI KTEGNLEQAN EELRAVIKKI WKKTSMKLLD QVVPPAGDDE 1620
VTVGKFYATF LIQDYFRKFK KRKEQGLVGK YPAKNTTIAL QAGLRTLHDI GPEIRRAISC 1680
DLQDDEPEET KREEEDDVFK RNGALLGNHV NHVNSDRRDS LQQTNTTHRP LHVQRPSIPP 1740
ASDTEKPLFP PAGNSVCHNH HNHNSIGKQV PTSTNANLNN ANMSKAAHGK RPSIGNLEHV 1800
SENGHHSSHK HDREPQRRSS VKRTRYYETY IRSDSGDEQL PTICREDPEI HGYFRDPHCL 1860
GEQEYFSSEE CYEDDSSPTW SRQNYGYYSR YPGRNIDSER PRGYHHPQGF LEDDDSPVCY 1920
DSRRSPRRRL LPPTPASHRR SSFNFECLRR QSSQEEVPSS PIFPHRTALP LHLMQQQIMA 1980
VAGLDSSKAQ KYSPSHSTRS WATPPATPPY RDWTPCYTPL IQVEQSEALD QVNGSLPSLH 2040
RSSWYTDEPD ISYRTFTPAS LTVPSSFRNK NSDKQRSADS LVEAVLISEG LGRYARDPKF 2100
VSATKHEIAD ACDLTIDEME SAASTLLNGN VRPRANGDVG PLSHRQDYEL QDFGPGYSDE 2160
EPDPGRDEED LADEMICITT L



SEQ ID NO:289 0B16 DNA SEQUENCE
Nucleic Acid Accession #: NM_002812
Coding sequence: 150-3362 (underlined sequence corresponds to start and stop codon)



1 11 21 31 41 51 

| | | | | |

AACTCCCGCC TCGGGACGCC TCGGGGTCGG GCTCCGGCTG CGGCTGCTGC TGCGGCGCCC 60 

GCGCTCCGGT GCGTCCGCCT CCTGTGCCCG CCGCGGAGCA GTCTGCGGCC CGCCGTGCGC 120 

CCTCAGCTCC TTTTCCTGAG CCCGCCGCGA TGGGAGCTGC GCGGGGATCC CCGGCCAGAC 180 

CCCGCCGGTT GCCTCTGCTC AGCGTCCTGC TGCTGCCGCT GCTGGGCGGT ACCCAGACAG 240 

CCATTGTCTT CATCAAGCAG CCGTCCTCCC AGGATGCACT GCAGGGGCGC CGGGCGCTGC 300

TTCGCTGTGA GGTTGAGGCT CCGGGCCCGG TACATGTGTA CTGGCTGCTC GATGGGGCCC 360 

CTGTCCAGGA CACGGAGCGG CGTTTCGCCC AGGGCAGCAG CCTGAGCTTT GCAGCTGTGG 420 

ACCGGCTGCA GGACTCTGGC ACCTTCCAGT GTGTGGCTCG GGATGATGTC ACTGGAGAAG 480 

AAGCCCGCAG TGCCAACGCC TCCTTCAACA TCAAATGGAT TGAGGCAGGT CCTGTGGTCC 540 

TGAAGCATCC AGCCTCGGAA GCTGAGATCC AGCCACAGAC CCAGGTCACA CTTCGTTGCC 600 

ACATTGATGG GCACCCTCGG CCCACCTACC AATGGTTCCG AGATGGGACC CCCCTTTCTG 660 

ATGGTCAGAG CAACCACACA GTCAGCAGCA AGGAGCGGAA CCTGACGCTC CGGCCAGCTG 720 

GTCCTGAGCA TAGTGGGCTG TATTCCTGCT GCGCCCACAG TGCTTTTGGC CAGGCTTGCA 780 

GCAGCCAGAA CTTCACCTTG AGCATTGCTG ATGAAAGCTT TGCCAGGGTG GTGCTGGCAC 840 

CCCAGGACGT GGTAGTAGCG AGGTATGAGG AGGCCATGTT CCATTGCCAG TTCTCAGCCC 900 

AGCCACCCCC GAGCCTGCAG TGGCTCTTTG AGGATGAGAC TCCCATCACT AACCGCAGTC 960

GCCCCCCACA CCTCCGCAGA GCCACAGTGT TTGCCAACGG GTCTCTGCTG CTGACCCAGG 1020

TCCGGCCACG CAATGCAGGG ATCTACCGCT GCATTGGCCA GGGGCAGAGG GGCCCACCCA 1080 

TCATCCTGGA AGCCACACTT CACCTAGCAG AGATTGAAGA CATGCCGCTA TTTGAGCCAC 1140 

GGGTGTTTAC AGCTGGCAGC GAGGAGCGTG TGACCTGCCT TCCCCCCAAG GGTCTGCCAG 1200 

AGCCCAGCGT GTGGTGGGAG CACGCGGGAG TCCGGCTGCC CACCCATGGC AGGGTCTACC 1260

AGAAGGGCCA CGAGCTGGTG TTGGCCAATA TTGCTGAAAG TGATGCTGGT GTCTACACCT 1320 

GCCACGCGGC CAACCTGGCT GGTCAGCGGA GACAGGATGT CAACATCACT GTGGCCACTG 1380 

TGCCCTCCTG GCTGAAGAAG CCCCAAGACA GCCAGCTGGA GGAGGGCAAA CCCGGCTACT 1440 

TGGATTGCCT GACCCAGGCC ACACCAAAAC CTACAGTTGT CTGGTACAGA AACCAGATGC 1500 

TCATCTCAGA GGACTCACGG TTCGAGGTCT TCAAGAATGG GACCTTGCGC ATCAACAGCG 1560 

TGGAGGTGTA TGATGGGACA TGGTACCGTT GTATGAGCAG CACCCCAGCC GGCAGCATCG 1620 

AGGCGCAAGC CCGTGTCCAA GTGCTGGAAA AGCTCAAGTT CACACCACCA CCCCAGCCAC 1680

AGCAGTGCAT GGAGTTTGAC AAGGAGGCCA CGGTGCCCTG TTCAGCCACA GGCCGAGAGA 1740 

AGCCCACTAT TAAGTGGGAA CGGGCAGATG GGAGCAGCCT CCCAGAGTGG GTGACAGACA 1800 

ACGCTGGGAC CCTGCATTTT GCCCGGGTGA CTCGAGATGA CGCTGGCAAC TACACTTGCA 1860 

TTGCCTCCAA CGGGCCGCAG GGCCAGATTC GTGCCCATGT CCAGCTCACT GTGGCAGTTT 1920 

TTATCACCTT CAAAGTGGAA CCAGAGCGTA CGACTGTGTA CCAGGGCCAC ACAGCCCTAC 1980 

TGCAGTGCGA GGCCCAGGGG GACCCCAAGC CGCTGATTCA GTGGAAAGGC AAGGACCGCA 2040 

TCCTGGACCC CACCAAGCTG GGACCCAGGA TGCACATCTT CCAGAATGGC TCCCTGGTGA 2100 

TCCATGACGT GGCCCCTGAG GACTCAGGCC GCTACACCTG CATTGCAGGC AACAGCTGCA 2160 

ACATCAAGCA CACGGAGGCC CCCCTCTATG TCGTGGACAA GCCTGTGCCG GAGGAGTCGG 2220 

AGGGCCCTGG CAGCCCTCCC CCCTACAAGA TGATCCAGAC CATTGGGTTG TCGGTGGGTG 2280 

CCGCTGTGGC CTACATCATT GCCGTGCTGG GCCTCATGTT CTACTGCAAG AAGCGCTGCA 2340 

AAGCCAAGCG GCTGCAGAAG CAGCCCGAGG GCGAGGAGCC AGAGATGGAA TGCCTCAACG 2400 

GAGGGCCTTT GCAGAACGGG CAGCCCTCAG CAGAGATCCA AGAAGAAGTG GCCTTGACCA 2460 

GCTTGGGCTC CGGCCCCGCG GCCACCAACA AACGCCACAG CACAAGTGAT AAGATGCACT 2520 

TCCCACGGTC TAGCCTGCAG CCCATCACCA CGCTGGGGAA GAGTGAGTTT GGGGAGGTGT 2580 

TCCTGGCAAA GGCTCAGGGC TTGGAGGAGG GAGTGGCAGA GACCCTGGTA CTTGTGAAGA 2640 

GCCTGCAGAC GAAGGATGAG CAGCAGCAGC TGGACTTCCG GAGGGAGTTG GAGATGTTTG 2700 

GGAAGCTGAA CCACGCCAAC GTGGTGCGGC TCCTGGGGCT GTGCCGGGAG GCTGAGCCCC 2760

ACTACATGGT GCTGGAATAT GTGGATCTGG GAGACCTCAA GCAGTTCCTG AGGATTTCCA 2820 

AGAGCAAGGA TGAAAAATTG AAGTCACAGC CCCTCAGCAC CAAGCAGAAG GTGGCCCTAT 2880 

GCACCCAGGT AGCCCTGGGC ATGGAGCACC TGTCCAACAA CCGCTTTGTG CATAAGGACT 2940 

TGGCTGCGCG TAACTGCCTG GTCAGTGCCC AGAGACAAGT GAAGGTGTCT GCCCTGGGCC 3000 

TCAGCAAGGA TGTGTACAAC AGTGAGTACT ACCACTTCCG CCAGGCCTGG GTGCCGCTGC 3060 

GCTGGATGTC CCCCGAGGCC ATCCTGGAGG GTGACTTCTC TACCAAGTCT GATGTCTGGG 3120 

CCTTCGGTGT GCTGATGTGG GAAGTGTTTA CACATGGAGA GATGCCCCAT GGTGGGCAGG 3180 

CAGATGATGA AGTACTGGCA GATTTGCAGG CTGGGAAGGC TAGACTTCCT CAGCCCGAGG 3240 

GCTGCCCTTC CAAACTCTAT CGGCTGATGC AGCGCTGCTG GGCCCTCAGC CCCAAGGACC 3300 

GGCCCTCCTT CAGTGAGATT GCCAGCGCCC TGGGAGACAG CACCGTGGAC AGCAAGCCGT 3360 

GAGGAGGGAG CCCGCTCAGG ATGGCCTGGG CAGGGGAGGA CATCTCTAGA GGGAAGCTCA 3420 

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CAGCATGATG GGCAAGATCC CTGTCCTCCT GGGCCCTGAG GTGCCCTAGT GCAACAGGCA 3480 

TTGCTGAGGT CTGAGCAGGG CCTGGCCTTT CCTCCTCTTC CTCACCCTCA TCCTTTGGGA 3540 

GGCTGACTTG GACCCAAACT GGGCGACTAG GGCTTTGAGC TGGGCAGTTT CCCCTGCCAC 3600 

CTCTTCCTCT ATCAGGGACA GTGTGGGTGC CACAGGTAAC CCCAATTTCT GGCCTTCAAC 3660 

TTCTCCCCTT GACCGGGTCC AACTCTGCCA CTCATCTGCC AACTTTGCCT GGGGAGGGCT 3720

AGGCTTGGGA TGAGCTGGGT TTGTGGGGAG TTCCTTAATA TTCTCAAGTT CTGGGCACAC 3780 

AGGGTTAATG AGTCTCTTGC CCACTGGTCC ACTTGGGGGT CTAGACCAGG ATTATAGAGG 3840 

ACACAGCAAG TGAGTCCTCC CCACTCTGGG CTTGTGCACA CTGACCCAGA CCCACGTCTT 3900

CCCCACCCTT CTCTCCTTTC CTCATCCTAA GTGCCTGGCA GATGAAGGAG TTTTCAGGAG 3960

CTTTTGACAC TATATAAACC GCCCTTTTTG TATGCACCAC GGGCGGCTTT TATATGTAAT 4020

TGCAGCGTGG GGTGGGTGGG CATGGGAGGT AGGGGTGGGC CCTGGAGATG AGGAGGGTGG 4080 

GCCATCCTTA CCCCACACTT TTATTGTTGT CGTTTTTTGT TTGTTTTGTT TTTTTGTTTT 4140 

TGTTTTTGTT TTTACACTCG CTGCTCTCAA TAAATAAGCC TTTTTTA 
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j 40 




SEQ ID NO:290 OBIS Protein sequence: 
Protein Accession #: NP_002812



1 11 21 31 41 51 

| | | | | |

MGAARGSPAR PRRLPLLSVL LLPLLGGTQT AIVFIKQPSS QDALQGRRAL LRCEVEAPGP 60

VHVYWLLDGA FVQDTERRFA QGSSLSFAAV DRLQDSGTFQ CVARDDVTGE EARSANASFN 120

IKWIEAGPVV LKHPASEAEI QPQTQVTLRC HIDGHPRPTY QWFRDGTPLS DGQSNHTVSS 180

KERNLTLRPA GPEHSGLYSC CAHSAFGQAC SSQNFTLSIA DESFARVVLA PQDVVVARYE 240

EAMFHCQFSA QPPPSLQWLF EDETPITNRS RPPHLRRATV FANGSLLLTQ VRPRNAGIYR 300

CIGQGQRGPP IILEATLHLA EIEDMPLFEP RVFTAGSEER VTCLPPKGLP EPSVWWEHAG 360

VRLPTHGRVY QKGHELVLAN IAESDAGVYT CHAANLAGQR RQDVNITVAT VPSWLKKPQD 420

SQLEEGKPGY LDCLTQATPK PTVVWYRNQM LISEDSRFEV FKNGTLRINS VEVYDGTWYR 480

CMSSTPAGSI EAQARVQVLE KLKFTPPPQP QQCMEFDKEA TVPCSATGRE KPTIKWERAD 540

GSSLPEWVTD NAGTLHFARV TRDDAGNYTC IASNGPQGQI RAHVQLTVAV FITFKVEPER 600

TTVYQGHTAL LQCEAQGDPK PLIQWKGKDR ILDPTKLGPR MHIFQNGSLV IHDVAPEDSG 660

RYTCIAGNSC NIKHTEAPLY VVDKPVPEES EGPGSPPPYK MIQTIGLSVG AAVAYIIAVL 720

GLMFYCKKRC KAKRLQKQPE GEEPEMECLN GGPLQNGQPS AEIQEEVALT SLGSGPAATN 780

KRHSTSDKMH FPRSSLQPIT TLGKSEFGEV FLAKAQGLEE GVAETLVLVK SLQTKDEQQQ 840

LDFRRELEMF GKLNHANVVR LLGLCREAEP HYMVLEYVDL GDLKQFLRIS KSKDEKLKSQ 900

PLSTKQKVAL CTQVALGMEH LSNNRFVHKD LAARNCLVSA QRCVKVSALG LSKDVYNSEY 960

YHFRQAWVPL RWMSPEAILE GDFSTKSDVW AFGVLMWEVF THGEMPHGGQ ADDEVLADLQ 1020
AGKARLFQPE GCPSKLYRLM QRCWALSPKD RPSFSEIASA LGDSTVDSRP



SEQ ID NO:291 AAB1 DNA SEQUENCE 

Nucleic Acid Accession #: NM_002205

Coding sequence: 1-3150 (underlined sequences correspond to start and stop codons) 

1 11 21 31 41 51 

| | | | | |

ATGGGGAGCC GGACGCCAGA GTCCCCTCTC CACGCCGTGC AGCTGCGCTG GGGCCCCCGG 60

CGCCGACCCC CGCTSSTGCC GCTGCTGTTG CTGCTSSTGC CGCCGCCACC CAGGGTCGGG 120 

GGCTTCAACT TAGACGCGGA GGCCCCAGCA GTACTCTCGG GGCCCCCGGG CTCCTTCTTC 180 

GGATTCTCAG TGGAGTTTTA CCGGCCGGGA ACAGACGGGG TCAGTGTGCT GGTGGGAGCA 240 

CCCAAGGCTA ATACCAGCCA GCCAGGAGTG CTGCAGGGTG GTGCTGTCTA CCTCTGTCCT 300

TGGGGTGCCA GCCCCACACA GTGCACCCCC ATTGAATTTG ACAGCAAAGG CTCTCGGCTC 360 

CTGGAGTCCT CACTGTCCAG CTCAGAGGGA GAGGAGCCTG TGGAGTACAA GTCCTTGCAG 420

TGGTTCGGGG CAACAGTTCG AGCCCATGGC TCCTCCATCT TGGCATGCGC TCCACTGTAC 480 

AGCTGGCGCA CAGAGAAGGA GCCACTGAGC GACCCCGTGG GCACCTGCTA CCTCTCCACA 540 

GATAACTTCA CCCGAATTCT GGAGTATGCA CCCTGCCGCT CAGATTTCAG CTGGGCAGCA 600 

GGACAGGGTT ACTGCCAAGG AGGCTTCAGT GCCGAGTTCA CCAAGACTGG CCGTGTGGTT 660 

TTAGGTGGAC CAGGAAGCTA TTTCTGGCAA GGCCAGATCC TGTCTGCCAC TCAGGAGCAG 720 

ATTGCAGAAT CTTATTACCC CGAGTACCTG ATCAACCTGG TTCAGGGGCA GCTGCAGACT 780 

CGCCAGGCCA GTTCCATCTA TGATGACAGC TACCTAGGAT ACTCTGTGGC TGTTGGTGAA 840 

TTCAGTGGTG ATGACACAGA AGACTTTGTT GCTGGTGTGC CCAAAGGGAA CCTCACTTAC 900 

GGCTATGTCA CCATCCTTAA TGGCTCAGAC ATTCGATCCC TCTACAACTT CTCAGGGGAA 960 

CAGATGGCCT CCTACTTTGG CTATGCAGTG GCCGCCACAG ACGTCAATGG GGACGGGCTG 1020 

GATGACTTGC TGGTGGGGGC ACCCCTGCTC ATGGATCGGA CCCCTGACGG GCGGCCTCAG 1080 

GAGGTGGGCA GGGTCTACGT CTACCTGCAG CACCCAGCCG GCATAGAGCC CACGCCCACC 1140 

CTTACCCTCA CTGGCCATGA TGAGTTTGGC CGATTTGGCA GCTCCTTGAC CCCCCTGGGG 1200 

GACCTGGACC AGGATGGCTA CAATGATGTG GCCATCGGGG CTCCCTTTGG TGGGGAGACC 1260 

CAGCAGGGAG TAGTGTTTGT ATTTCCTGGG GGCCCAGGAG GGCTGGGCTC TAAGCCTTCC 1320 

CAGGTTCTGC AGCCCCTGTG GGCAGCCAGC CACACCCCAG ACTTCTTTGG CTCTGCCCTT 1380 

CGAGGAGGCC GAGACCTGGA TGGCAATGGA TATCCTGATC TGATTGTGGG GTCCTTTGGT 1440 

GTGGACAAGG CTGTGGTATA CAGGGGCCGC CCCATCGTGT CCGCTAGTGC CTCCCTCACC 1500 

ATCTTCCCCG CCATGTTCAA CCCAGAGGAG CGGAGCTGCA GCTTAGAGGG GAACCCTGTG 1560 

GCCTGCATCA ACCTTAGCTT CTGCCTCAAT GCTTCTGGAA AACACGTTGC TGACTCCATT 1620 

GGTTTCACAG TGGAACTTCA GCTGGACTGG CAGAAGCAGA AGGGAGGGGT ACGGCGGGCA 1680 

CTGTTCCTGG CCTCCAGGCA GGCAACCCTG ACCCAGACCC TGCTCATCCA GAATGGGGCT 1740 

CGAGAGGATT GCAGAGAGAT GAAGATCTAC CTCAGGAACG AGTCAGAATT TCGAGACAAA 1800 

CTCTCGCCGA TTCACATCGC TCTCAACTTC TCCTTGGACC CCCAAGCCCC AGTGGACAGC 1860 

CACGGCCTCA GGCCAGCCCT ACATTATCAG AGCAAGAGCC GGATAGAGGA CAAGGCTCAG 1920 

ATCTTGCTGG ACTGTGGAGA AGACAACATC TGTGTGCCTG ACCTGCAGCT GGAAGTGTTT 1980 
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GGGGAGCAGA ACCATGTGTA CCTGGGTGAC AAGAATGCCC TGAACCTCAC TTTCCATGCC 2040 

CAGAATGTGG GTGAGGGTGG CGCCTATGAG GCTGAGCTTC GGGTCACCGC CCCTCCAGAG 2100 

GCTGAGTACT CAGGACTCGT CAGACACCCA GGGAACTTCT CCAGCCTGAG CTGTGACTAC 2160 

TTTGCCGTGA ACCAGAGCCG CCTGCTGGTG TGTGACCTGG GCAACCCCAT GAAGGCAGGA 2220 

GCCAGTCTGT GGGGTGGCCT TCGGTTTACA GTCCCTCATC TCCGGGACAC TAAGAAAACC 2280 

ATCCAGTTTG ACTTCCAGAT CCTCAGCAAG AATCTCAACA ACTCGCAAAG CGACGTGGTT 2340 

TCCTTTCGGC TCTCCGTGGA GGCTCAGGCC CAGGTCACCC TGAACGGTGT CTCCAAGCCT 2400 

GAGGCAGTGC TATTCCCAGT AAGCGACTGG CATCCCCGAG ACCAGCCTCA GAAGGAGGAG 2460 

GACCTGGGAC CTGCTGTCCA CCATGTCTAT GAGCTCATCA ACCAAGGCCC CAGCTCCATT 2520 

AGCCAGGGTG TGCTGGAACT CAGCTGTCCC CAGGCTCTGG AAGGTCAGCA GCTCCTATAT 2580 

GTGACCAGAG TTACGGGACT CAACTGCACC ACCAATCACC CCATTAACCC AAAGGGCCTG 2640 

GAGTTGGATC CCGAGGGTTC CCTGCACCAC CAGCAAAAAC GGGAAGCTCC AAGCCGCAGC 2700 

TCTGCTTCCT CGGGACCTCA GATCCTGAAA TGCCCGGAGG CTGAGTGTTT CAGGCTGCGC 2760 

TGTGAGCTCG GGCCCCTGCA CCAACAAGAG AGCCAAAGTC TGCAGTTGCA TTTCCGAGTC 2820 

TGGGCCAAGA CTTTCTTGCA GCGGGAGCAC CAGCCATTTA GCCTGCAGTG TGAGGCTGTG 2880 

TACAAAGCCC TGAAGATGCC CTACCGAATC CTGCCTCGGC AGCTGCCCCA AAAAGAGCGT 2940 

CAGGTGGCCA CAGCTGTGCA ATGGACCAAG GCAGAAGGCA GCTATGGCGT CCCACTGTGG 3000 

ATCATCATCC TAGCCATCCT GTTTGGCCTC CTGCTCCTAG GTCTACTCAT CTACATCCTC 3060 

TACAAGCTTG GATTCTTCAA ACGCTCCCTC CCATATGGCA CCGCCATGGA AAAAGCTCAG 3120 
CTCAAGCCTC CAGCCACCTC TGATGCCTGA 



SEQ ID NO:292 AAB1 Protein sequence:
Protein Accession #: NP_002196

1 11 21 31 41 51

| | | | | |

MGSRTPESPL HAVQLRWGPR RRPPLLPLLL LLLPPPPRVG GFNLDAEAPA VLSGPPGSFF 60 

GFSVEFYRPG TDGVSVLVGA PKANTSQPGV LQGGAVYLCP WGASPTQCTP IEFDSKGSRL 120 

LESSLSSSEG EEFVEYKSLQ WFGATVRAHG SSILACAPLY SWRTEKEFLS DPVGTCYLST 180 

DNFTRILEYA PCRSDFSWAA GQGYCQGGFS AEFTKTGRVV LGGPGSYFWQ GQILSATQEQ 240

IAESYYPEYL INLVQGQLQT RQASSIYDDS YLGYSVAVGE FSGDDTEDFV AGVPKGNLTY 300 

GYVTILNGSD IRSLYNFSGE QMASYFGYAV AATDVNGDGL DDLLVGAPLL MDRTPDGRPQ 360 

EVGRVYVYLQ HPAGIEPTPT LTLTGHDEFG RFGSSLTPLG DLDQDGYNDV AIGAPFGGET 420

QQGVVFVFPG GPGGLGSKPS QVLQPLWAAS HTPDFFGSAL RGGRDLDGNG YPDLIVGSFG 480

VDKAVVYRGR PIVSASASLT IFPAMFNPEE RSCSLEGNPV ACINLSFCLN ASGKHVADSI 540

GFTVELQLDW QKQKGGVRRA LFLASRQATL TQTLLIQNGA REDCREMKIY LRNESEFRDK 600 

LSPIHIALNF SLDPQAPVDS HGLRPALHYQ SKSRIEDKAQ ILLDCGEDNI CVPDLQLEVF 660 

GEQNHVYLGD KNALNLTFHA QNVGEGGAYE AELRVTAPPE AEYSGLVRHP GNFSSLSCDY 720 

FAVNQSRLLV CDLGNPMKAG ASLWGGLRFT VFHLRDTKKT IQFDFQILSK NLNNSQSDVV 780

SFRLSVEAQA QVTLNGVSKP EAVLFPVSDW HPRDQPQKEE DLGPAVHHVY ELINQGPSSI 840 

SQGVLELSCP QALEGQQLLY VTRVTGLNCT TNHPINPKGL ELDPEGSLHH QQKREAPSRS 900 

SASSGPQILK CPEAECFRLR CELGPLHQQE SQSLQLHFRV WAKTFLQREH QPFSLQCEAV 960 

YKALKMPYRI LPRQLPQKER QVATAVQWTK AEGSYGVPLW IIILAILFGL LLLGLLIYIL 1020 
YKLGFFKRSL PYGTAMEKAQ LKPPATSDA 



SEQ ID NO:293 LBH4 DNA SEQUENCE

Nucleic Acid Accession #: BC001291 

Coding sequence: 44-541 (start and stop codons are underlined) 



1 11 21 31 41 51
| | | | | |

GGGGGCGCCG CGCGCTGACC CTCCCTGGGC ACCGCTGGGG ACGATGGCGC TGCTCGCCTT 60 
GCTGCTGGTC GTGGCCCTAC CGCGGGTGTG GACAGACGCC AACCTGACTG CGAGACAACG 120 
AGATCCAGAG GACTCCCAGC GAACGGACGA GGGTGACAAT AGAGTGTGGT GTCATGTTTG 180
TGAGAGAGAA AACACTTTCG AGTGCCAGAA CCCAAGGAGG TGCAAATGGA CAGAGCCATA 240 
CTGCGTTATA GCGGCCGTGA AAATATTTCC ACGTTTTTTC ATGGTTGCGA AGCAGTGCTC 300 
CGCTGGTTGT GCAGCGATGG AGAGACCCAA GCCAGAGGAG AAGCGGTTTC TCCTGGAAGA 360
GCCCATGCCC TTCTTTTACC TCAAGTGTTG TAAAATTCGC TACTGCAATT TAGAGGGGCC 420 
ACCTATCAAC TCATCAGTGT TCAAAGAATA TGCTGGGAGC ATGGGTGAGA GCTGTGGTGG 480 
GCTGTGGCTG GCCATCCTCC TGCTGCTGGC CTCCATTGCA GCCGGCCTCA GCCTGTCTTG 540 
AGCCACGGGA CTGCCACAGA CTGAGCCTTC CGGAGCATGG ACTCGCTCCA GACCGTTGTC 600 
ACCTGTTGCA TTAAACTTGT TTTCTGTTGA TTACCTCTTG GTTTGACTTC CCAGGGTCTT 660 
GGGATGGGAG AGTGGGGATC AGGTGCAGTT GGCTCTTAAC CCTCAAGGGT TCTTTAACTC 720 
ACATTCAGAG GAAGTCCAGA TCTCCTGAGT AGTGATTTTG GTGACAAGTT TTTCTCTTTG 780
AAATCAAACC TTGTAACTCA TTTATTGCTG ATGGCCACTC TTTTCCTTGA CTCCCCTCTG 840 
CCTCTGAGGG CTTCAGTATT GATGGGGAGG GAGGCCTAAG TACCACTCAT GGAGAGTATG 900 
TGCTGAGATG CTTCCGACCT TTCAGGTGAC GCAGGAACAC TGGGGGAGTC TGAATGATTG 960 
GGGTGAAGAC ATCCCTGGAG TGAAGGACTC CTCAGCATGG GGGGCAGTGG GGCACACGTT 1020
AGGGCTGCCC CCATTCCAGT GGTGGAGGCG CTGTGGATGG CTGCTTTTCC TCAACCTTTC 1080
CTACCAGATT CCAGGAGGCA GAAGATAACT AATTGTGTTG AAGAAACTTA GACTTCACCC 1140
ACCAGCTGGC ACAGGTGCAC AGATTCATAA ATTCCCACAC GTGTGTGTTC AACATCTGAA 1200 
ACTTAGGCCA AGTAGAGAGC ATCAGGGTAA ATGGCGTTCA TTTCTCTGTT AAGATGCAGC 1260 
CATCCATGGG GAGCTGAGAA ATCAGACTCA AAGTTCCACC AAAAACAAAT ACAAGGGGAC 1320 
TTCAAAAGTT CACGAAAAAA AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA AAA 
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SEQ ID NO:294 LBH4 Protein sequence: 
Protein Accession #: AAH01291 



1 11 21 31 41 51
| | | | | |

MALLALLLVV ALPRVWTDAN LTARQRDPED SQRTDEGDNR VWCHVCEREN TFECQNPRRC 60
KWTEPYCVIA AVKIFPRFFM VAKQCSAGCA AMERPKPEEK RFLLEEPMPF FYLKCCKIRY 120
CNLEGPPINS SVFKEYAGSM GESCGGLWLA ILLLLASIAA GLSLS
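
Each DNA record above gives its coding sequence as a pair of 1-based, inclusive nucleotide positions (for example, "Coding sequence: 44-541"). The following is a minimal sketch, not part of the original listing, of how such coordinates could be applied to a sequence printed in blocks of ten with running position counts. The function names (clean_listing, extract_cds, translate) and the short toy sequence are hypothetical, and the standard genetic code is assumed.

```python
import re

# Standard genetic code, generated from the conventional T, C, A, G codon ordering.
_BASES = "TCAG"
_AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: _AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(_BASES)
    for j, b in enumerate(_BASES)
    for k, c in enumerate(_BASES)
}


def clean_listing(text):
    """Keep only A/C/G/T; drops block spacing, line breaks and position counts.

    Assumes the text holds sequence rows only (no ruler or header lines).
    """
    return re.sub(r"[^ACGT]", "", text.upper())


def extract_cds(text, start, end):
    """Return the coding sequence for 1-based, inclusive coordinates."""
    seq = clean_listing(text)
    if not 1 <= start <= end <= len(seq):
        raise ValueError("coordinates fall outside the sequence")
    return seq[start - 1:end]


def translate(cds):
    """Translate codon by codon, stopping at the first stop codon."""
    residues = []
    for i in range(0, len(cds) - 2, 3):
        aa = CODON_TABLE[cds[i:i + 3]]
        if aa == "*":
            break
        residues.append(aa)
    return "".join(residues)


# Hypothetical toy row (not one of the sequences above): start codon at
# position 3, stop codon at positions 15-17, running count "19" at the end.
toy_listing = "CCATGGCGCT GCTCTGAGG 19"
cds = extract_cds(toy_listing, 3, 17)
print(cds, translate(cds))  # prints "ATGGCGCTGCTCTGA MALL"
```

Applied to a full record, the translated product could be compared against the corresponding protein sequence as a consistency check on the stated coordinates.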
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It is understood that the examples described above in no way serve to limit the 
true scope of this invention, but rather are presented for illustrative purposes. All 
publications, sequences of accession numbers, and patent applications cited in this 
specification are herein incorporated by reference as if each individual publication or patent 
application were specifically and individually indicated to be incorporated by reference. 
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